I am trying to read in this delimited text file. It has a .csv extension, but the file type is reported as UTF-16 Unicode text. What am I doing wrong?
df <- read.delim("/Users/admin/Downloads/data1.csv", sep = ",")
Error in make.names(col.names, unique = TRUE) :
invalid multibyte string at '<ff><fe>
<ff><fe>'
warnings()
Warning messages:
1: In grep("^[^#].*", lines, value = TRUE) :
input string 1 is invalid in this locale
2: In read.table(path, encoding = encoding, header = header, ... :
line 1 appears to contain embedded nulls
For the "invalid multibyte string at" error, have a look at the fileEncoding argument. In my case, I solved it using: fileEncoding="latin1"
Here are the docs:
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/Encoding
Make sure to input the correct Encoding.
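The <ff><fe> bytes in the error from the question are the byte order mark of little-endian UTF-16, so for that file "UTF-16LE" is probably a better guess than "latin1". Below is a minimal sketch under that assumption. The temporary file and its contents are made up for illustration, and it relies on the platform's iconv supporting "UTF-16LE", which is common but not guaranteed.

```r
# Build a small UTF-16LE file with a BOM (hypothetical stand-in for data1.csv)
path <- tempfile(fileext = ".csv")
txt  <- "id,name\n1,a\n2,b\n"
body <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), body), path)

# Peek at the first two bytes: ff fe suggests little-endian UTF-16
bom <- readBin(path, what = "raw", n = 2)
print(bom)

# Tell R the file's encoding so it is converted while being read
df <- read.csv(path, fileEncoding = "UTF-16LE")
str(df)
```

If the first two bytes of your own file are something else, a different fileEncoding value is likely needed; iconvlist() shows the names available on your system.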
I had never seen this error before; maybe the default arguments of the read functions have changed.
Today I got the same error while reading a csv file with complex column names. I solved it using fileEncoding='latin1', and also took the opportunity to add check.names = FALSE to avoid having spaces replaced by dots.
# comma-separated file
mydata <- read.csv('my_dataset.csv', fileEncoding = 'latin1', check.names = FALSE)
# semicolon-separated file with decimal commas
mydata <- read.csv2('my_dataset.csv', fileEncoding = 'latin1', check.names = FALSE)
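To see what check.names changes, here is a small sketch that reads the same inline text twice. The column names and values are invented for the example.

```r
# Inline CSV with column names that are not syntactically valid R names
csv_text <- "first name,total (USD)\nAnn,10\nBob,20"

# Default (check.names = TRUE): names are passed through make.names()
fixed <- read.csv(text = csv_text)

# check.names = FALSE: names are kept exactly as they appear in the header
kept <- read.csv(text = csv_text, check.names = FALSE)

print(names(fixed))
print(names(kept))
```

Columns with such names then need backticks or [[ ]] to be accessed, for example kept[["first name"]].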