I have a problem using data from a tab-delimited data file imported with read.delim.
Most of the columns contain numerical data which I need to run a t.test on. Unfortunately I always get this error:
Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In mean.default(y) : argument is not numeric or logical: returning NA
I noticed that this only happens with vectors that consist of different levels.
It won't even perform simple numerical operations like vector[1] + vector[2] for leveled vectors.
Vectors without levels work fine, though.
How can I use the data in the leveled vectors for calculation?
Thank you
I have been able to reproduce your error message with the following small example:
x = as.factor(1:5)
y = as.factor(1:5)
t.test(x, y)
yields
Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) : argument is not numeric or logical: returning NA
2: In mean.default(y) : argument is not numeric or logical: returning NA
The problem is you are trying to perform a t-test on non-numeric vectors. Addition likewise is not defined for factors:
x + y
yields
[1] NA NA NA NA NA
Warning message:
In Ops.factor(x, y) : + not meaningful for factors
The warning gives keen insight as to what is amiss and also explains why your t-test is not working.
To fix the problem, do as ilya suggests: convert your factors to numeric with as.numeric(as.character(x)). Calling as.numeric(x) alone would return the internal level codes instead of the original values.
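A minimal sketch of that conversion, using a made-up factor f in place of one of your columns (the name and values are only for illustration). It also keeps the result of the direct conversion so the two can be compared:

```r
# Hypothetical factor, like one read.delim might produce from numbers stored as text
f <- factor(c("10", "20", "5", "20"))

# Direct conversion yields the internal level codes, not the values
codes <- as.numeric(f)

# Going through character first recovers the original numbers
values <- as.numeric(as.character(f))

print(codes)
print(values)

# After conversion, arithmetic works as usual
total <- values[1] + values[2]
print(total)
```

Once every column of interest has been converted this way, t.test should accept the vectors like any other numeric input.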
You say "Most of the columns contain numerical data". That's the problem. apply can be used without changing the data type only when all columns contain numerical data, because it first converts the data frame to a matrix, and a matrix holds a single type. If any other column contains non-numerical data, every value becomes a character string, so you should convert back to numeric inside the function passed to apply:
pvalue <- apply(x, 1, function(tmp) {
  # Only test rows where both groups vary; otherwise return NA
  if (length(unique(c(tmp[5], tmp[7], tmp[9]))) != 1 &&
      length(unique(c(tmp[11], tmp[13], tmp[15]))) != 1)
    t.test(as.numeric(tmp[c(5, 7, 9)]),
           as.numeric(tmp[c(11, 13, 15)]))$p.value
  else NA
})
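To see why the conversion inside the function is needed, here is a small sketch with a hypothetical data frame df (the column names id, v1 and v2 are made up). It assumes one text column sits alongside the numeric ones, as in the question:

```r
# Hypothetical data frame: one text column and two numeric columns
df <- data.frame(id = c("a", "b"), v1 = c(1.5, 2.5), v2 = c(3, 4),
                 stringsAsFactors = FALSE)

# apply() converts df to a matrix first; the text column forces
# every cell, including the numbers, to character
row_classes <- apply(df, 1, class)
print(row_classes)

# Converting inside the function restores numbers for the chosen columns
row_sums <- apply(df, 1, function(tmp) sum(as.numeric(tmp[c("v1", "v2")])))
print(row_sums)
```

An alternative that avoids the round trip through character is to subset to the numeric columns before calling apply, for example apply(df[, c("v1", "v2")], 1, sum).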