添加链接
link之家
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接
id int, info map comment 'map<姓名,性别>', favorite array comment 'array[football,basketball]' row format delimited fields terminated by '\t' --列分隔符 collection items terminated by ',' --array中各个item之间的分隔符 map keys terminated by ':' --map中key和value之间的分隔符 lines terminated by '\n'; --行分隔符

2.3 导入数据

load data local inpath '/opt/module/jobs/student.txt' into table studentInfo;

2.3 执行查询

select *  from studentInfo;
+-----------------+---------------------+----------------------------+--+
| studentinfo.id  |  studentinfo.info   |    studentinfo.favorite    |
+-----------------+---------------------+----------------------------+--+
| 1               | {"zhangsan":"man"}  | ["football","basketball"]  |
| 2               | {"lisi":"female"}   | ["sing","dance"]           |
+-----------------+---------------------+----------------------------+--+
-- 对于map查询,map[key]
--对于array查询,array[index]
select id, info['zhangsan'],favorite[1] from studentInfo;
+-----+-------+-------------+--+
| id  |  sex  |  favorite   |
+-----+-------+-------------+--+
| 1   | man   | basketball  |
| 2   | NULL  | dance       |
+-----+-------+-------------+--+

3. Impala 使用复杂类型

注意:Impala 只用parquet格式存储时,才能使用复杂数据类型

3.1 Hive中建表(parquet格式,导入数据

create table student_parquet(
    id int,
    info map<string,string>  comment 'map<姓名,性别>',
    favorite array<string> comment 'array[football,basketball]'
stored as parquet
insert overwrite table student_parquet select id,info,favorite from studentInfo;

3.2 刷新impala元数据

refresh default.student_parquet;

3.3 执行查询

select 
    id ,favorite_array.item,info_map.key,info_map.value
from student_parquet,
    student_parquet.info as info_map,
    student_parquet.favorite as favorite_array;
+----+------------+----------+--------+
| id | item       | key      | value  |
+----+------------+----------+--------+
| 1  | football   | zhangsan | man    |
| 1  | basketball | zhangsan | man    |
| 2  | sing       | lisi     | female |
| 2  | dance      | lisi     | female |
+----+------------+----------+--------+
from student_parquet,
    student_parquet.info as info_map,
    student_parquet.favorite as favorite_array
where favorite_array.POS = 0;
+----+----------+
| id | item     |
+----+----------+
| 1  | football |
| 2  | sing     |
+----+----------+
    id ,favorite_array.item,info_map.value
from student_parquet,
    student_parquet.info as info_map,
    student_parquet.favorite as favorite_array
where favorite_array.item = 'sing'
and info_map.key = 'lisi';
+----+------+--------+
| id | item | value  |
+----+------+--------+
| 2  | sing | female |
+----+------+--------+

array 类型视为 一张表, 其列名为 item

map类型有两个列, 一个是key, 一个是value

  • https://blog.csdn.net/rav009/article/details/86750850
  • https://docs.cloudera.com/documentation/enterprise/5-5-x/topics/impala_complex_types.html
  •