Thu, 08 May 2008

Downloading a large database from iSeries in csv format.

Today, my task is to receive from the iSeries a huge table (500 Mb in the CSV format). Although I have a so-called ADSL connection, downloading this file through Client Access would take hours (about 8 hours I presume).

The solution is to compress the file before transfer, then decompress it on Windows. But, I didn't know how to compress data on the iSeries, in a format that can be decompressed on Windows. After searching, I finally found that I wouldn't need to download or install anything to the iSeries. Hope this post can help someone.

Creating the csv file on the iSeries

Everyone knows how to do this : It's the command CPYTOIMPF which I've used to create a CSV format of the database, directly to the iSeries' IFS. This is how I did it :

CPYTOIMPF 
  FROMFILE(MYLIB/MYFILE) 
  TOSTMF('/home/mycsvfile.csv') 
  MBROPT(*REPLACE)
  STMFCODPAG(*STMF)
  RCDDLM(*CRLF)
  FLDDLM(';')

CCSID conversion

Before we will compress the text file, it is really important to convert the file to have a CCSID that your Windows program could read. Have in mind that, when you will compress the text file, the compressed file you get is somewhat another binary file. If you download a text file by ftp, your transfer will be in text mode, and an implicit conversion may occur, so you don't have to bother about character set conversion. But while the ftp client will see the compressed file as binary, it cannot do the implicit conversion anymore. You will be able to download the compressed file and decompress it, but the decompressed file may be in an EBCDIC encoding.

What you need to know is the respective CCSIDs of your file in the iSeries and of the decompressed file on Windows. On iSeries, it is generally your job's CCSID. If you don't know, you could use the command WRKLNK '/home/mycsvfile.csv', then use option 8 to display it. On Windows, you may wish to use the iso8859-1 encoding, which have a CCSID value of 819. The other commonly used encoding is UTF-8 Unicode, which has a CCSID value of 1208.

A simple command to perform the conversion is CPY.

CPY
 OBJ('/home/mycsvfile.csv')
 TOOBJ('/home/mycsvfile.pcformat.csv')
 FROMCCSID(37)
 TOCCSID(819)
 DTAFMT(*TEXT)

Compressing the csv file

To compress the text file, we use the jar tool in QSHELL. If you didn't never use QSHELL (this was the first time for me), QSHELL is like the console on Windows® or Unix. To start it, just type STRQSH. You'll get a screen like this :

                                                         QSH Command Entry                                                          
                                                                                                                                    
   $                                                                                                                                
                                                                                                                                    
                                                                                                                                    
                                                                                                                                    
                                                                                                                                    
 ===>                                                                                                                   
                                                                                                                        
                                                                                                                        
                                                                                                                        
 F3=Exit   F6=Print F9=Retrieve F12=Disconnect                                                                                      
 F13=Clear F17=Top  F18=Bottom  F21=CL command entry                                                                                

All we need is then to go to the IFS folder where we put our text file, and launch the jar command there.

cd /home
jar cvfM mycsvfile.jar mycsvfile.pcformat.csv

When compressing is finished, you can type F3 to exit the QSHELL environment.

The compression ratio is generally above 90% because it's just text file, and it is rarely below 75%. I was able to reduce the size of my download from 500 to 24 M Mb. On Windows®, I use the opensource software 7-zip to uncompress the jar file.

Execute the whole process in a batch

The whole process can be done in batch. QShell can perform tasks in batch, as long as those tasks don't need any input from the user.

Previously, we invoked the QShell "console" without parameters to begin an interactive session. Now, in order to perform any tasks in batch, we supply a parameter CMD which is a string representing a semicolon separated QShell commands.

But, a precaution to take is to remove the v modifier in the jar command. So, we will change cvfM to cfM. The v standed for verbose : it allowed the interactive session to display useful informations about the compression process, but it also required an input from the user after the process has finished. To conclude, this is the CL command that create the jar file in batch :

QSH CMD('cd /home; jar cfM mycsvfile.jar mycsvfile.pcformat.csv')

Ahhhh, a little final note : QShell may have problems with filename containing special characters like $, @, ... Consider using filename consisting of letters from a to z, numbers and period only.

posté le: 07:11 | path: /data/as400 | permalien

Wed, 14 Nov 2007

SQL0326

One day, I've created a program in the SQLRPG language, with RPG400 (and not RPGIV). It was something like the snippet shown below, except that it had about two dozens of fields in the SELECT clause.

C/EXEC SQL 
C+ DECLARE C1 CURSOR FOR SELECT A, B, C FROM T
C/END-EXEC
 * ...
C/EXEC SQL
C+ FETCH NEXT FROM C1 INTO :A, :B, :C
C/END-EXEC

When compiling, I've got the strange SQL0326 error message. A little command DSPMSGD SQL0326 QSQLMSG shows me that it meant Too many host variables specified. I restart counting again and again, but I definitely had the same number of host variables as the number of fields in my SELECT clause. Googling didn't help, seems like I didn't have the right search keywords or no one before encountered this issue. I tried every figure I could imagine, blindly, and after one day of searching, I finally found the trick!

Some fields in the database had 300 characters length. In fact, they represented in one string an array of 6 50 characters sub-elements. In my program, I declared the variable as an array of 6 elements of 50 charcters long. I thought it as a clever way to get automagically the value of each element, but that was the problem.

In an RPG/400 program, a string variable can have a length less that 256 characters. This doesn't apply to arrays, so, even if the total length of the variable array was 300, the compiler didn't see it as a problem. But at the FETCH instruction, I think fields having a length larger that 256 characters weren't considered. The solution was to, either use SUBSTR in my SELECT clause to limit the length of the strings fetch to be under 256, or CVTRPGSRC to pass to RPGIV, which does not have this limitation.

C/EXEC SQL 
C+ DECLARE C1 CURSOR FOR SELECT 
C+ SUBSTR(A, 1, 250) AS A1, SUBSTR(A, 251, 50) AS A2, B, C FROM T
C/END-EXEC
 * ...
C/EXEC SQL
C+ FETCH NEXT FROM C1 INTO :A1, :A2, :B, :C
C/END-EXEC
posté le: 14:16 | path: /data/as400/sqlrpg | permalien
Septembre 2010
Dim Lun Mar Mer Jeu Ven Sam
     
   

Malagasy miray
About

DotMG's joblog

Work hard at whatever you do. (Ecc. 9. 10a. CEV)
Valid XHTML 1.0 Transitional   Valid CSS!   Powered by blosxom 2.0