Problem
Can any Hive dataset be partitioned?
When I partition a dataset on a field in the middle of the row, the data preview results in a "no data found" error. If I remove the PARTITIONED BY clause from the CREATE TABLE command, the data is visible.
Also, ODAS rejects the use of the same field name for the CREATE TABLE command and the PARTITIONED BY clause. If I change the field name or remove it from the CREATE TABLE, the query succeeds but I cannot preview the data as ODAS reports "no data found". I tried omitting the column name from the CREATE TABLE and only using it in the PARTITIONED BY clause, but then no data was visible.
Does this indicate that the data, as it is stored, is not amenable to partitioning on that column? Is this an ODAS or a Hive issue?
Solution
Create the partitioned table from scratch, over data that is already stored by partition column (in separate folders, disks, files, …). Partition columns are declared in the PARTITIONED BY clause. They do not appear in the main column list of CREATE TABLE, which is why using the same field name in both places is rejected, but they behave like regular columns: they can be queried and used in filters.
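As an illustration only, the following Python sketch lays out a small dataset in the `column=value` folder convention that Hive uses for partitions, and then builds matching DDL statements. The table name `sales`, the column names, the sample rows, the CSV format, and the location `/data/sales` are all hypothetical. The sketch also assumes standard Hive behavior, where each partition of an external table has to be registered (for example with ALTER TABLE ... ADD PARTITION) before its rows become visible; whether ODAS needs the same step is not confirmed here.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample rows; "region" is the field we want to partition on.
ROWS = [
    {"id": "1", "region": "emea", "amount": "10"},
    {"id": "2", "region": "apac", "amount": "20"},
    {"id": "3", "region": "emea", "amount": "30"},
]
DATA_COLUMNS = ["id", "amount"]  # stay in the CREATE TABLE column list
PARTITION_COLUMN = "region"      # appears only in PARTITIONED BY


def write_partitioned(rows, root):
    """Write rows to <root>/region=<value>/data.csv.

    The partition column is left out of the files: its value is carried
    by the folder name. Returns the sorted list of partition values.
    """
    by_value = {}
    for row in rows:
        by_value.setdefault(row[PARTITION_COLUMN], []).append(row)
    for value, group in by_value.items():
        part_dir = Path(root) / f"{PARTITION_COLUMN}={value}"
        part_dir.mkdir(parents=True, exist_ok=True)
        with open(part_dir / "data.csv", "w", newline="") as fh:
            writer = csv.writer(fh)
            for row in group:
                writer.writerow([row[c] for c in DATA_COLUMNS])
    return sorted(by_value)


def build_ddl(table, location, partition_values):
    """Build Hive-style DDL: one CREATE TABLE plus one ADD PARTITION each."""
    cols = ", ".join(f"{c} STRING" for c in DATA_COLUMNS)
    statements = [
        f"CREATE EXTERNAL TABLE {table} ({cols}) "
        f"PARTITIONED BY ({PARTITION_COLUMN} STRING) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"LOCATION '{location}'"
    ]
    for value in partition_values:
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({PARTITION_COLUMN}='{value}')"
        )
    return statements


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        values = write_partitioned(ROWS, root)
        for path in sorted(Path(root).rglob("data.csv")):
            print(path.relative_to(root))
    for statement in build_ddl("sales", "/data/sales", values):
        print(statement + ";")
```

Note that `region` is absent both from the data files and from the main column list. If the field instead sits in the middle of each stored row and the files are not split into per-value folders, the existing layout does not match what the PARTITIONED BY clause describes, and the data would need to be rewritten into such a layout first.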