Bug 265

Summary: [DB] Handle comma in csv data
Product: MovieXXX
Reporter: tomfong521
Component: MovieAnalysisEngine
Assignee: salinang3-c
Status: RESOLVED FIXED    
Severity: enhancement    
Priority: Normal    
Version: 1.0   
Hardware: PC   
OS: Windows   
Deadline: 2018-11-17   

Description tomfong521 2018-11-29 13:07:55 HKT
Details:
A .csv file stores each row as values separated by commas, so the current reading method splits each line on every comma. When a value itself contains a comma (e.g. a film name with a comma), the line is split in the wrong places and the row is not read as expected.
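
A minimal sketch of the failure, not code taken from the project: the class name NaiveCsvSplit and the sample row are made up for illustration, and the sketch assumes the reader uses a plain String.split(",").

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not from the project): shows why splitting on every comma breaks.
final class NaiveCsvSplit {
    // Splits a line on every comma, including commas inside quoted values.
    static List<String> split(String line) {
        return Arrays.asList(line.split(","));
    }
}
```

For the illustrative row "Crouching Tiger, Hidden Dragon",2000,Ang Lee this returns four pieces instead of three, because the film name is cut at its own comma.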

Current workaround
- Remove all rows whose data contain a comma

Expected behaviour
- A .csv file whose data contain commas can be read properly
Comment 1 salinang3-c 2018-11-29 23:18:17 HKT
How to enhance:
- In a .csv file, a value that contains a comma is enclosed in a pair of double quotation marks (")
- When reading the .csv file, split each line only on commas that are not between a pair of double quotation marks
- Changed the reader to split each line from the .csv file with the regex pattern ",(?![^\"]*\")"
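
The splitting rule in the list above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the class QuoteAwareCsvSplit and its methods are hypothetical, and the separator pattern is a commonly used alternative that treats a comma as a separator only when an even number of double quotes follows it. The pattern recorded above is stricter: it splits only on commas that have no double quote anywhere after them on the line, so it fits rows in which the quoted values come before the unquoted ones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; class and method names are not from the project.
final class QuoteAwareCsvSplit {
    // A comma is a separator only if an even number of double quotes follows it,
    // i.e. the comma is not inside a quoted value.
    private static final String SEPARATOR = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        // Limit -1 keeps trailing empty values, e.g. "a,b," yields three fields.
        for (String raw : line.split(SEPARATOR, -1)) {
            fields.add(unquote(raw));
        }
        return fields;
    }

    // Removes the enclosing quotes and turns doubled quotes ("") back into one quote.
    private static String unquote(String field) {
        if (field.length() >= 2 && field.startsWith("\"") && field.endsWith("\"")) {
            return field.substring(1, field.length() - 1).replace("\"\"", "\"");
        }
        return field;
    }
}
```

With the illustrative row 2000,"Crouching Tiger, Hidden Dragon",Ang Lee, QuoteAwareCsvSplit.split returns three fields and the film name keeps its comma. The sketch reads one line at a time, so quoted values that span several lines are not handled.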

Expected behaviour
- A .csv file whose data contain commas can be read properly

Potentially affected areas
- DB handling

Affected versions
- >= v2

Testing steps
1. Edit a test database so that it includes data containing a comma
2. Search for a movie whose info contains a comma
3. Print searchResult and check that the Film class objects are constructed correctly

Expected results
- At testing step (3), all Film class objects can be constructed correctly

Testing results
- At testing step (3), all Film class objects are constructed correctly