Free Information Technology Magazines and eBooks

Monday, May 18, 2009

VB.NET: How to read a CSV file into an array

How to read a CSV fileA Comma separated values (CSV) file is a computer data file commonly used for exchanging information, for example between a database program and a spreadsheet program. Each line in the CSV file corresponds to a row in the table. Within a line, fields are separated by commas, each field belonging to one table column. Although Comma-separated value lists are very old technology and predate personal computers by more than a decade, it is still widely used today by many applications.


On this article, you will learn how to read a CSV file and put the values in an array. The sample code below uses StreamReader to load the file so you have to include the following imports:


Imports System.IO



Function in reading CSV file using VB.NET

Private Sub ReadCSVFileToArray()
Dim strfilename As String
Dim num_rows As Long
Dim num_cols As Long
Dim x As Integer
Dim y As Integer
Dim strarray(1, 1) As String

' Load the file.
strfilename = "test.csv"

'Check if file exist
If File.Exists(strfilename) Then
Dim tmpstream As StreamReader = File.OpenText(strfilename)
Dim strlines() As String
Dim strline() As String

'Load content of file to strLines array
strlines = tmpstream.ReadToEnd().Split(Environment.NewLine)

' Redimension the array.
num_rows = UBound(strlines)
strline = strlines(0).Split(",")
num_cols = UBound(strline)
ReDim strarray(num_rows, num_cols)

' Copy the data into the array.
For x = 0 To num_rows
strline = strlines(x).Split(",")
For y = 0 To num_cols
strarray(x, y) = strline(y)
Next
Next


' Display the data in textbox
For x = 0 To num_rows
For y = 0 To num_cols
TextBox1.Text = TextBox1.Text & strarray(x, y) & ","
Next
TextBox1.Text = TextBox1.Text & Environment.NewLine
Next

End If

End Sub


Here is the result of invoking the VB.NET function:

CSV file to array

Aside from storing the CSV into array, you can also directly store the parsed comma-separated values into database. This method is very useful in integrating two or more systems with different platforms.

You can download the CSV file to array sample project here.

For more VB.NET Coding Tips and Tricks, subscribe now

7 comments:

Rob said...

This is great if you assume that no column in your csv file has an embedded comma. For example, if you parse a line with a column of CompanyName and the name = "Joes Garage, Inc"...your parse routine will throw "Inc" into a separate array member and subsequently throw off processing.

Nisar Khan said...

Well the logic is that incase there is a comma, the split would include quotation marks. if there is a quotatoin mark in your csv file, the split would contain two quotation marks.

regds,
Nisar

Anonymous said...

I would never use
TextBox1.Text = TextBox1.Text & strarray(x, y) & ","

I replaced it with
sb.Append(strarray(x, y))
sb.Append(",") and sb.AppendLine("") where appropriate

TextBox1.Text = sb.ToString

I hit performnace issues with 32,000 rows.

Anonymous said...

Thank you for posting this, it is a clean and nice example.

Anonymous said...

from the second row on, the data in the first column contains a CrLf... any suggestions to remove it?

thanks

barry.b said...

@ Nisar Khan

What Rob said (first comment) still holds true:

"test data only", 123456, "more text"
"test data, still", 567890, "more text, still"

simply using a string split on "," won't take into account the fact that the second row (above) has it's comma embedded in the text. Both should produce three fields, although the parsing as you see have it will incorrectly create six fields.

What's needed is to intelligently ignore any commas (as a field delimiter) if it's surrounded by double-quotes (the text delimiter).

Personally, I can't see how you can do this without RegEx and it's ability to look ahead (once finding the start of the text with the text delimiter, look for the matching end one and ignore all commas within that text).

thoughts?

JD said...

I had the same issue with the CrLf that anonymous had. A simple replace will work (my extra character was actually just a Lf, not a CrLf).

strarray(x, y) = strline(y).replace(vbLf,"")