Welcome toVigges Developer Community-Open, Learning,Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
434 views
in Technique[技术] by (71.8m points)

Create a table from one line CSV data on SAS

I try to import data from a csv with only one line data formatted like this :

CAS$#$#$LLT_CODE$#$#$PT_CODE$#$#$HLT_CODE$#$#$HLGT_CODE$#$#$SOC_CODE$#$#$LLT$#$#$PT$#$#$HLT$#$#$HLGT$#$#$SOC$#$#$SOC_ABB#$#$#DJ20210005-0$#$#$10001896$#$#$10012271$#$#$10001897$#$#$10057167$#$#$10029205$#$#$Maladie d'Alzheimer$#$#$Démence de type Alzheimer$#$#$Maladie d'Alzheimer (incl sous-types)$#$#$Déficiences mentales$#$#$Affections du système nerveux$#$#$Nerv#$#$#DJ20210005-0$#$#$10019308$#$#$10003664$#$#$10007607$#$#$10007510$#$#$10010331$#$#$Communication interauriculaire$#$#$Communication interauriculaire$#$#$Défauts congénitaux du septum cardiaque$#$#$Troubles congénitaux cardiovasculaires$#$#$Affections congénitales, familiales et génétiques$#$#$Cong#$#$#

"#$#$#" determine end of line and "$#$#$" separe columns.

How can i do to import it ?

Here's my code :

data a; infile "C:/Users/Papa Yatma/Documents/My SAS Files/9.4/ATCD.txt" dlm="$" dsd ; input var1 $ var2 $ var3 $ var4 $ var5 $ var6 $ var7 $ var8 $ var9 $ var10 $ var11 $ var12 $ @@; run;

Thank you for your help.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

As long as the actual "records" are not too long I would use the DLMSTR= option to process the file twice. First to parse the "records" into lines. Then to read the fields from the lines.

So first make a new text file that has one line per record.

filename new temp;
data _null_;
  infile have recfm=n lrecl=1000000 dlmstr='#$#$#';
  file new ;
  input line :$32767. @;
  put line ;
run;

Now you can read the file NEW using the other delimiter string.

For example you could convert it to a real CSV file.

filename csv temp;
data _null_;
  infile new dlmstr='$#$#$' length=ll column=cc truncover ;
  file csv dsd ;
  do until(cc>=ll);
    input word :$32767. @ ;
    put word @;
  end;
  put;
run;

Results:

CAS,LLT_CODE,PT_CODE,HLT_CODE,HLGT_CODE,SOC_CODE,LLT,PT,HLT,HLGT,SOC,SOC_ABB
DJ20210005-0,10001896,10012271,10001897,10057167,10029205,Maladie d'Alzheimer,Démence de type Alzheimer,Maladie d'Alzheimer (incl sous-types),Déficiences mentales,Affections du système nerveux,Nerv
DJ20210005-0,10019308,10003664,10007607,10007510,10010331,Communication interauriculaire,Communication interauriculaire,Défauts congénitaux du septum cardiaque,Troubles congénitaux cardiovasculaires,"Affections congénitales, familiales et génétiques",Cong

This CSV file is then easy to read:

data test;
  infile csv dsd firstobs=2 truncover ;
  length CAS LLT_CODE PT_CODE HLT_CODE HLGT_CODE SOC_CODE LLT PT HLT HLGT SOC SOC_ABB $100;
  input CAS -- SOC_ABB;
run;

If it is possible any of the values might include end of line characters then you should add code to replace those in the first step. For example you might add this line to replace CRLF strings with pipe characters.

line = tranwrd(line,'0D0A'x,'|');

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to Vigges Developer Community for programmer and developer-Open, Learning and Share
...