Spark SQL case insensitive filter for column conditions

val df = sc.parallelize(Seq(
  (1L, "Fortinet"), (2L, "foRtinet"), (3L, "foo")
)).toDF("k", "v")
df.where($"v".rlike("(?i)^fortinet$")).show
// +---+--------+
// |  k|       v|
// +---+--------+
// |  1|Fortinet|
// |  2|foRtinet|
// +---+--------+
Or use simple equality with lower / upper:
import org.apache.spark.sql.functions.{lower, upper}
df.where(lower($"v") === "fortinet").show
// +---+--------+
// |  k|       v|
// +---+--------+
// |  1|Fortinet|
// |  2|foRtinet|
// +---+--------+
df.where(upper($"v") === "FORTINET").show
// +---+--------+
// |  k|       v|
// +---+--------+
// |  1|Fortinet|
// |  2|foRtinet|
// +---+--------+
For simple filters I would prefer rlike, although performance should be similar. For join conditions, equality is a much better choice. See "How can we JOIN two Spark SQL dataframes using a SQL-esque "LIKE" criterion?" for details.
@zero3232 I have this problem with all tables. I need my application to return case-insensitive results everywhere. Is there a solution that gives SQL Server-like results (where case is always ignored)?
– Parth Vishvajit
Nov 28, 2017 at 11:51
For future viewers confused by the (?i) as I was: that is the Java regex syntax for an inline flag, which Scala and Spark's rlike also use. In this instance it turns on case-insensitive matching.
– Excel Help
Feb 2, 2021 at 16:29
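To make the (?i) flag concrete, here is a minimal sketch in plain Python rather than Spark. It assumes only that Python's re module accepts the same inline flag at the start of a pattern. The list `values` and the variable names are illustrative and reuse the sample data from the answer above. It compares the anchored case-insensitive regex with the lower() equality check.

```python
import re

# Sample values mirroring the "v" column of the DataFrame in the answer above.
values = ["Fortinet", "foRtinet", "foo"]

# (?i) is an inline flag that turns on case-insensitive matching.
# ^ and $ anchor the pattern so that only the whole string can match.
pattern = re.compile(r"(?i)^fortinet$")

# Regex approach: keep values that match the anchored, case-insensitive pattern.
regex_matches = [v for v in values if pattern.search(v)]

# Equality approach: normalize case first, then compare with a lowercase literal.
lower_matches = [v for v in values if v.lower() == "fortinet"]

print(regex_matches)
print(lower_matches)
```

Both lists contain the same two values here. Without the ^ and $ anchors, the regex would also match longer strings that merely contain "fortinet", which the equality check would not.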
Try using the lower/upper string functions:
dataFrame.filter(lower(dataFrame.col("vendor")).equalTo("fortinet"))
dataFrame.filter(upper(dataFrame.col("vendor")).equalTo("FORTINET"))
Another alternative, in PySpark, which saves a couple of sets of parentheses:
import pyspark.sql.functions as f
df.filter(f.upper("vendor") == "FORTINET")