Cloudera CCA159 — CCA Data Analyst Exam

  1. Dates
    A) Convert string to date:
    select to_date(from_unixtime(unix_timestamp(order_date),'yyyy-MM-dd HH:mm:ss')) from orders limit 10;
    B) Add days to date
    select date_add(to_date(from_unixtime(unix_timestamp(order_date),'yyyy-MM-dd HH:mm:ss')),1) from orders limit 10;
    C) Add time to date
    select from_unixtime(unix_timestamp(order_date) + 3600,'yyyy-MM-dd HH:mm:ss') from orders limit 10;
  2. split: splits a string into an array
    select split(line,' ') word from files;
  3. explode: explodes an array into rows
    select explode(split(line,' ')) word from files;
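    Combining items 2 and 3 gives the classic word count (a minimal sketch, assuming the same files table with a single string column line):
    select word, count(*) as cnt
    from (select explode(split(line,' ')) as word from files) w
    group by word;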
  4. LATERAL VIEW
    select r_regionkey, SUM(nations_details.n_nationkey), COUNT(nations_details.n_name) from region LATERAL VIEW explode(r_nations) exploded_table as nations_details GROUP BY r_regionkey;

    SELECT tf.* FROM (SELECT 0) t LATERAL VIEW EXPLODE(array('A','B','C')) tf as col;
  5. concat
    select concat('a','b','c');
  6. concat_ws
    select concat_ws('|','a','b','c');
  7. collect_list
    select deptno, collect_list(deptid), max(sal) from dept group by deptno;
    +--------+--------+------+
    | deptno | _c1    | _c2  |
    +--------+--------+------+
    | 1      | [1,2]  | 11   |
    +--------+--------+------+
  8. collect_set
    Same as collect_list, but removes duplicates; for example:
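    A minimal sketch against the same dept table from item 7 (assumes deptid can repeat within a deptno):
    select deptno, collect_list(deptid), collect_set(deptid) from dept group by deptno;
    -- e.g. collect_list may return [1,1,2] where collect_set returns [1,2]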
  9. Generating a report file in Hive
    INSERT OVERWRITE DIRECTORY '/user/<<UserName>>/data/region/' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE SELECT * FROM region LIMIT 10;
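    The same pattern works with a LOCAL directory when the report should land on the local filesystem instead of HDFS (the path below is a placeholder):
    INSERT OVERWRITE LOCAL DIRECTORY '/tmp/region_report/' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE SELECT * FROM region LIMIT 10;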
  10. Table Stored as Avro
    CREATE TABLE kst (field1 string, field2 int)
    PARTITIONED BY (ds string)
    STORED AS AVRO;
    INSERT INTO TABLE kst PARTITION(ds='A') VALUES ('B',3);
    create table kst_2 STORED AS AVRO as select * from kst where 1=2;
    create table kst_3 STORED AS AVRO TBLPROPERTIES("avro.compress"="snappy") as select * from kst where 1=2;
  11. ALTER TABLE
    alter table m change `_c1` cnt bigint;
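    The statement above renames the auto-generated aggregate column _c1 to cnt with type bigint. Other common variants (a sketch reusing the same table m; the new names are placeholders):
    alter table m add columns (note string);
    alter table m rename to m_summary;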
  12. For transactional tables
    set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
    set hive.support.concurrency=true;
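    A minimal sketch of a table those settings enable (names are placeholders; on the Hive versions shipped with CDH the table must be stored as ORC, bucketed, and flagged transactional):
    CREATE TABLE dept_txn (deptid int, deptname string)
    CLUSTERED BY (deptid) INTO 4 BUCKETS
    STORED AS ORC
    TBLPROPERTIES ('transactional'='true');
    UPDATE dept_txn SET deptname = 'Sales' WHERE deptid = 1;
    DELETE FROM dept_txn WHERE deptid = 2;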
  13. Recover deleted files in Hadoop
    hdfs dfs -ls /user/<<UserOrg>>/.Trash/Current/user/<<UserOrg>>/warehouse/test.db/dept1
    Move the file back to the correct directory and reload it, for example:
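    A sketch of that recovery, reusing the placeholder paths from the listing above (the destination mirrors the original warehouse location that .Trash preserves):
    hdfs dfs -mv /user/<<UserOrg>>/.Trash/Current/user/<<UserOrg>>/warehouse/test.db/dept1 /user/<<UserOrg>>/warehouse/test.db/dept1
    Once the file is back under the table's warehouse directory, the table can read it again.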
