When I started out as a programmer all those years ago, I was coding in COBOL using IBM Mainframes running MVS with IMS and DB2 Databases. Those were the days! Of course, the death of COBOL has always been greatly exaggerated. Every once in a while I get a call from a recruiter asking about my COBOL skills on my resume. But there is still a huge amount of data stored on Mainframes across the world that needs to be loaded into our Data Warehouses, so Microsoft has introduced some new collations in SQL Server 2008 to support this data.
My COBOL skills are a bit rusty and I have not been tempted to learn COBOL.Net as a dialect. (Yes, this exists, I am not kidding…). In fact, one of the reasons I got into databases was to get away from the Procedure Division of COBOL programs. So when a student asked me about EBCDIC support in SQL Server 2008, it brought back lots of memories. Of course, most of our flat files on the Microsoft platform are ASCII format, but there needs to be a way of translating the Mainframe data in IBM’s EBCDIC format to the ASCII format.
SQL Server Integration Services is responsible for the extract, transform and load of data, so surely it has some mechanisms for this conversion. Well, the trouble is, there are many forms of EBCDIC that were created before the world wide web and global sharing of data was a real concept. Tools like Informatica support these multiple formats but SSIS is relatively new to the market and is only now being considered as an industrial strength ETL tool. So there were a few holes there.
Checking out SSIS 2008, I counted 31 different EBCDIC formats now supported from Turkish to Thai to Modern Greek. An important one supported is EBCDIC code page 500 which is for International. However, there is more to conversion than selecting the correct code page. In EBCDIC there are packed decimal fields, Redefines, Occurs, Occurs Depending On etc which need to be handled correctly. That’s where a true ETL tool comes to play, with the intelligence to make this type of process automatic. Maybe next release of SSIS?
Well, in the meantime third party tools may be required to help us out. One such example is a Connection Manager from aminosoftware.com . It’s called Lysine and prevents you having to pre-process the data before it is pumped into an ASCII byte stream. Since SSIS is extensible, you can write custom components like this yourself. But why bother when someone else has been there, done that. Maybe Microsoft will snap it up and include it in SP1? We’ll see.