-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bulk load data conversion error (truncation) for concept_synonym_name #1047
Comments
Hi @stpatnoe, I think the error in this case is related to the fact that the field concept_synonym_name in the concept_synonym table has the VARCHAR(1000) type, while the concept 1251539 has a synonym name truncated to a length of 1000 and the name itself contains Unicode characters Ξ. As a solution to this issue, I would suggest changing the data type of the concept_synonym_name field from VARCHAR(1000) to NVARCHAR(1000), since NVARCHAR is designed to work with Unicode text. |
Somebody in the vocab team should remove that "synonym" altogether. |
Maybe, but it wouldn't solve the DDL issue. |
Switching to NVARCHAR(1000) did not resolve the issue for me. It looks like there are 54 concepts in the 'OMOP Invest Drug' vocabulary_id that have more than 1000 characters. I was able to use the bulk insert after updating the concept_synonym_name column to NVARCHAR(MAX): ALTER TABLE CONCEPT_SYNONYM ALTER COLUMN concept_synonym_name nvarchar(max) not null TRUNCATE TABLE CONCEPT_SYNONYM; |
While running the SQL server BULK INSERT script for the CONCEPT_SYNONYM table for the latest vocabulary (v5.0 30-AUG-24), I'm getting the following error message for multiple rows:
Msg 4863, Level 16, State 1, Line 85
Bulk load data conversion error (truncation) for row 420061, column 2 (concept_synonym_name).
The row referenced in the error message above is for concept_id 1251539. It looks like the concept_synonym_name field was cut at 1000 characters but then continues on a second line?
The bulk insert script I'm using is the same format as the one provided below from this link.
TRUNCATE TABLE CONCEPT_SYNONYM;
BULK INSERT CONCEPT_SYNONYM
FROM 'C:\CDMV5VOCAB\CONCEPT_SYNONYM.csv'
WITH (
FIRSTROW = 2,
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '0x0a',
ERRORFILE = 'C:\CDMV5VOCAB\CONCEPT_SYNONYM.bad',
TABLOCK
);
The text was updated successfully, but these errors were encountered: