Collectives™ on Stack Overflow
Find centralized, trusted content and collaborate around the technologies you use most.
Learn more about Collectives
Teams
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
Learn more about Teams
I'm trying to execute the test query like this:
SELECT COUNT(CASE WHEN name IN (SELECT name FROM requiredProducts) THEN name END)
FROM myProducts
which throws the following exception:
java.lang.ClassCastException:
org.apache.spark.sql.execution.datasources.LogicalRelation cannot be cast to
org.apache.spark.sql.execution.SparkPlan
I have a suggestion that IN operator can not be used in CASE WHEN. Is it really so? Spark documentation is silent about this.
–
The IN operator using a subquery does not work in a projection regardless of whether it is contained in a CASE WHEN, it will only work in filters. It works fine if you specify values in the IN clause directly rather than using a subquery.
I am not sure how to generate the exact exception you got above, but when I attempt to run a similar query in Spark Scala, it returns a more descriptive error:
org.apache.spark.sql.AnalysisException: IN/EXISTS predicate sub-queries can only be used in a Filter: Project [CASE WHEN agi_label#5 IN (list#96 []) THEN 1 ELSE 0 END AS CASE WHEN (agi_label IN (listquery())) THEN 1 ELSE 0 END#97]
I have run into this issue in the past. Your best bet is probably to restructure it to use a left join to requiredProducts
and then check for a null in the case statement. For example, something like this might work:
SELECT COUNT(CASE WHEN rp.name is not null THEN mp.name END)
FROM myProducts mp
LEFT JOIN requiredProducts rp ON mp.name = rp.name
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.