Handling special characters in reading and writing to CSV


venkata kirankumar
Hi All,


I have a CSV file whose data contain special characters, newline characters,
and characters from different languages, such as `~!@#$%^&*|
()-_+={[}]|\:;""'<,>.?/
When I read this CSV and try to do calculations, I am not able to get such
data into a single cell as it appears in the file. I found that something
like the RFC 4180 format can help to solve this problem.

If anyone can give suggestions for handling these special characters,
it will be helpful for me.



Thanks in advance,

D V Kiran Kumar

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Handling special characters in reading and writing to CSV

David Winsemius

On Feb 4, 2014, at 7:58 AM, Venkata Kirankumar wrote:

> Hi All,
>
> I have a CSV file whose data contain special characters, newline characters,
> and characters from different languages, such as `~!@#$%^&*|
> ()-_+={[}]|\:;""'<,>.?/
> When I read this CSV and try to do calculations, I am not able to get such
> data into a single cell as it appears in the file. I found that something
> like the RFC 4180 format can help to solve this problem.
>
> If anyone can give suggestions for handling these special characters,
> it will be helpful for me.
>

I'm having a difficult time understanding your expectations and the data situation. If it's a "csv file", then how can all three of <comma>, <single-quote>, and <double-quote> be properly distinguished when they are also part of the data?


You might consider using readLines (from base) or read.fwf (from the utils package).
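
A minimal sketch of how readLines behaves on this kind of data, using a made-up two-column file (the file contents and the names tmp, raw_lines and dat are illustrative assumptions, not the original poster's data). readLines returns one element per physical line, so a quoted field that contains a newline ends up split across elements, whereas read.csv should keep the quoted newline inside one cell.

```r
# Hypothetical two-column file; the second field contains an embedded newline.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,comment", '1,"first line', 'second line"'), tmp)

# readLines() returns physical lines, so the quoted field is split in two.
raw_lines <- readLines(tmp)
length(raw_lines)

# read.csv() honours the double quotes and keeps the newline inside one cell.
dat <- read.csv(tmp, stringsAsFactors = FALSE)
dim(dat)
dat$comment
```

If readLines is used anyway, the split pieces would have to be stitched back together by hand before the fields can be separated.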



David Winsemius
Alameda, CA, USA


Re: Handling special characters in reading and writing to CSV

venkata kirankumar
Hi David,


In the RFC 4180 CSV format, any ' or " character in the data is written with
an escape character, so the fields can still be distinguished properly.

I will try read.fwf, because with redline I am facing the same issue.

Thanks & Regards,
D V Kiran Kumar.


On Wed, Feb 5, 2014 at 3:14 AM, David Winsemius <[hidden email]>wrote:

> I'm having a difficult time understanding your expectations and the data
> situation. If it's a "csv file", then how can all three of <comma>,
> <single-quote>, and <double-quote> be properly distinguished when they are
> also part of the data?
>
> You might consider using readLines (from base) or read.fwf (from the utils
> package)

Re: Handling special characters in reading and writing to CSV

Ista Zahn
Hi Kiran,

Please post a reproducible example, either by pasting a sample of
comma separated values into your message or by posting a .csv file somewhere
where we can download it. Without an example all we can do is guess
what your problem might be.

Best,
Ista

On Wed, Feb 5, 2014 at 5:10 AM, Venkata Kirankumar
<[hidden email]> wrote:

> Hi David,
>
>
> In CSV RFC 4180 format if any ' or " character is there then character will
> go with escape character so CSV will distinguish properly.
>
>
>
> I will try with read.fwf once because with redline I am facing same issue.
>
> Thanks & Regards,
> D V Kiran Kumar.

Re: Handling special characters in reading and writing to CSV

David Winsemius
In reply to this post by venkata kirankumar

On Feb 5, 2014, at 2:10 AM, Venkata Kirankumar wrote:

> Hi David,
>  
> In CSV RFC 4180 format if any ‘ or “ character is there then character will go with escape character so CSV will distinguish properly.
>  
> I will try with read.fwf once because with redline I am facing same issue.
>  
> Thanks & Regards,
> D V Kiran Kumar.

I said 'readLines' not 'redline'.

It appears to me that read.table (which is a wrapper to scan) currently reads files that conform to that standard, i.e. doubled quotes are treated as if they were escaped.

> read.table(text= '"aaa","b""bb","ccc"' ,sep=",")
   V1   V2  V3
1 aaa b"bb ccc

> read.table(text= "'aaa','b''bb','ccc'" ,sep=",")
   V1   V2  V3
1 aaa b'bb ccc
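
The same doubling convention appears to hold on the writing side. Below is a small round-trip sketch with a made-up one-row data frame (the names df, txt, tmp and back are illustrative only). It relies on write.csv using qmethod = "double" by default, so embedded double quotes are written doubled, and on read.csv turning them back into single ones.

```r
# Illustrative data: one field holding a comma, an apostrophe and double quotes.
df <- data.frame(id = 1L, txt = "it's \"quoted\", with a comma",
                 stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".csv")
write.csv(df, tmp, row.names = FALSE)   # qmethod = "double" is the write.csv default

# Inspect the raw text: embedded double quotes are written doubled.
cat(readLines(tmp), sep = "\n")

# Reading the file back should recover the original string.
back <- read.csv(tmp, stringsAsFactors = FALSE)
identical(back$txt, df$txt)
```

Note that write.table defaults to qmethod = "escape" (a backslash before the quote), so write.csv, or an explicit qmethod = "double", is the safer choice when the file must follow the doubled-quote convention.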


--
David.


Re: Handling special characters in reading and writing to CSV

venkata kirankumar
In reply to this post by Ista Zahn
Dear Ista,
I copied my data below

UNIQUEID,FINDINGSID,ORGNUMRES,STNUMRES,CONVRES,VISITDY,ORGCHARRES,STCHARRES,NOMINALDAY,NOMINALDATE,MEASRMTDAY,MEASRMTDATE,INPUTDATE,NEOPLASMNAME,TUMORCLASSNAME,CATDOMAIN,CATDID,SPECIMENTYP,SPTDID,PCDOMAIN,USUBJID,PCDID,TESTDOMAIN,TSTDID,ORRESUNIT,RESDID,SUBJECTSID,STDRESUNIT,STDRDID,CONVRESUNIT,COVRDID,CUSTOMFIELD5,GRPLABEL,GRPNUMBER,SEX,SEXDID,TRIALGROUPSID,SPECIMENLOC,SPECIMENCOND,SPECIMENCOND1,SPECIMENCOND2,SPECIMENCOND3,SEVERITY,COMM,ASPECT,CAUSEOFDEATH,DERIVEFLG,PHASENAME,PHASENAMEDID,ENTITY,ENTITYDID,SECONDARYFLAG,CUSTOMFIELD0,CUSTOMFIELD4,CUSTOMFIELD6,CUSTOMFIELD9,SOURCE,RESCATEGORY,OFSPSEX,OFFSPNUM,FILEID,ANALYTEID,ANALYTEDID,DATETIME,ELTM,ENDY,NOMDAYOFPHASE,OFSPSEXDID,PLTIMEPOINT,STATUSFLAG,ANABIOREGION,TESTMETHOD,FINDLOC,CUSTOMFIELD8,TIMESLOTDESC,TIMESLOTCODE,PTPTN,TPTNUM
3073004,3073004,,37.800000000000000,37.800000000000000,61,© ® ™ ℠ ℗ ₳ ฿ ₵ ¢
₡ ₢ ₠ $ ₫ ৳ ₯ € ƒ ₣ ₲ ₴ ₭ ₺ ℳ ₥ ₦ ₧ ₱ ₰ £ ₹ ₨ ₪ ₸ ₮ ₩ ¥ ៛,© ® ™ ℠ ℗ ₳ ฿ ₵ ¢
₡ ₢ ₠ $ ₫ ৳ ₯ € ƒ ₣ ₲ ₴ ₭ ₺ ℳ ₥ ₦ ₧ ₱ ₰ £ ₹ ₨ ₪ ₸ ₮ ₩ ¥
៛,61,,,,,,,,,,,BW,"apcu102881`~!@#$%^&*()-_+={[}]|:;""'<,>.?/",16082,© ® ™
℠ ℗ ₳ ฿ ₵ ¢ ₡ ₢ ₠ $ ₫ ৳ ₯ € ƒ ₣ ₲ ₴ ₭ ₺ ℳ ₥ ₦ ₧ ₱ ₰ £ ₹ ₨ ₪ ₸ ₮ ₩ ¥ ៛,,© ®
™ ℠ ℗ ₳ ฿ ₵ ¢ ₡ ₢ ₠ $ ₫ ৳ ₯ € ƒ ₣ ₲ ₴ ₭ ₺ ℳ ₥ ₦ ₧ ₱ ₰ £ ₹ ₨ ₪ ₸ ₮ ₩ ¥
៛,45741,38733,© ® ™ ℠ ℗ ₳ ฿ ₵ ¢ ₡ ₢ ₠ $ ₫ ৳ ₯ € ƒ ₣ ₲ ₴ ₭ ₺ ℳ ₥ ₦ ₧ ₱ ₰ £ ₹
₨ ₪ ₸ ₮ ₩ ¥ ៛,,© ® ™ ℠ ℗ ₳ ฿ ₵ ¢ ₡ ₢ ₠ $ ₫ ৳ ₯ € ƒ ₣ ₲ ₴ ₭ ₺ ℳ ₥ ₦ ₧ ₱ ₰ £
₹ ₨ ₪ ₸ ₮ ₩ ¥
៛,,"USB102881`~!@#$%^&*()-_+={[}]|:;""'<,>.?/","SC51`~!@#$%^&*()-_+={[}]|:;""'<,>.?/:
SET 1€ é í ñ ó ú ü ¿ á é í ó ú ü
ñ","SC51`~!@#$%^&*()-_+={[}]|:;""'<,>.?/",M,42133,1445,,,,,,,,,,,,,,,,,,,,,,,,2631,,,,,,,,,A,,,,,,,©
® ™ ℠ ℗ ₳ ฿ ₵ ¢ ₡ ₢ ₠ $ ₫ ৳ ₯ € ƒ ₣ ₲ ₴ ₭ ₺ ℳ ₥ ₦ ₧ ₱ ₰ £ ₹ ₨ ₪ ₸ ₮ ₩ ¥ ៛,

Thanks & Regards,
D V Kiran Kumar


On Wed, Feb 5, 2014 at 5:05 PM, Ista Zahn <[hidden email]> wrote:

> Hi Kiran,
>
> Please post a reproducible example, either by pasting a sample of
> comma separated values into your message or by posting a .csv file somewhere
> where we can download it. Without an example all we can do is guess
> what your problem might be.
>
> Best,
> Ista

Re: Handling special characters in reading and writing to CSV

Ista Zahn
Hi Venkata,

That example reads into R fine for me. I copied and saved it as
tmp.csv and simply read it in with

dat <- read.csv("tmp.csv")

which gave me a data.frame with one row and 78 columns as expected.
This worked in three different environments (linux, mac, windows), and
with different versions of R. Does it not work for you? If not please
post the results of running sessionInfo() so we can see what version
of R etc. you are using.
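
For anyone repeating this check, here is a hedged sketch. It assumes the file is UTF-8 encoded, which the thread does not state, and it uses a tiny made-up file rather than the posted data. The encoding argument of read.csv only marks the strings as UTF-8; it does not re-encode the input.

```r
# Made-up sample with non-ASCII currency symbols, written as raw UTF-8 bytes.
tmp <- tempfile(fileext = ".csv")
writeLines(enc2utf8(c("id,unit", "1,\u20ac \u00a3 \u00a5")), tmp, useBytes = TRUE)

# Assumption: the input is UTF-8; 'encoding' marks the strings read as UTF-8.
dat <- read.csv(tmp, encoding = "UTF-8", stringsAsFactors = FALSE)

dim(dat)              # rows and columns actually parsed
Encoding(dat$unit)    # how R has marked the non-ASCII field

# Version and locale details worth including in a follow-up post.
sessionInfo()
```

If the dimensions differ from what the header suggests, the quoting or the encoding of the file is the first thing to look at.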

Best,
Ista

On Thu, Feb 6, 2014 at 3:34 AM, Venkata Kirankumar
<[hidden email]> wrote:

> Dear Ista,
> I copied my data below

Re: Handling special characters in reading and writing to CSV

venkata kirankumar
Hi Ista,
I tried it, and it's working fine for me.

Thank you.

Regards,
D V Kiran Kumar


On Thu, Feb 6, 2014 at 11:13 PM, Ista Zahn <[hidden email]> wrote:

> Hi Venkata,
>
> That example reads into R fine for me. I copied and saved it as
> tmp.csv and simply read it in with
>
> dat <- read.csv("tmp.csv")
>
> which gave me a data.frame with one row and 78 columns as expected.
> This worked in three different environments (linux, mac, windows), and
> with different versions of R. Does it not work for you? If not please
> post the results of running sessionInfo() so we can see what version
> of R etc. you are using.
>
> Best,
> Ista