ANALYZE TABLE

Description

The ANALYZE TABLE statement collects statistics about the table to be used by the query optimizer to find a better query execution plan.

Syntax

ANALYZE TABLE table_identifier [ partition_spec ]
    COMPUTE STATISTICS [ NOSCAN | FOR COLUMNS col [ , ... ] | FOR ALL COLUMNS ]

Parameters

table_identifier
Specifies a table name, which may be optionally qualified with a database name.

Syntax: [ database_name. ] table_name
partition_spec
An optional parameter that specifies a comma separated list of key and value pairs for partitions. When specified, partition statistics is returned.

Syntax: PARTITION ( partition_col_name [ = partition_col_val ] [ , ... ] )
[ NOSCAN | FOR COLUMNS col [ , ... ] | FOR ALL COLUMNS ]
  • If no analyze option is specified, ANALYZE TABLE collects the table's number of rows and size in bytes.
  • NOSCAN
    Collect only the table's size in bytes ( which does not require scanning the entire table ).
  • FOR COLUMNS col [ , ... ] | FOR ALL COLUMNS
    Collect column statistics for each column specified, or alternatively for every column, as well as table statistics.

Examples

 ANALYZE TABLE students COMPUTE STATISTICS NOSCAN;

 DESC EXTENDED students;
     ......
     Statistics	2820 bytes
     ......

 ANALYZE TABLE students COMPUTE STATISTICS;

 DESC EXTENDED students;
     ......
     Statistics	2820 bytes, 3 rows
     ......

 ANALYZE TABLE students PARTITION (student_id = 111111) COMPUTE STATISTICS;

 DESC EXTENDED students PARTITION (student_id = 111111);
     ......
     Partition Statistics	919 bytes, 1 rows
     ......

 ANALYZE TABLE students COMPUTE STATISTICS FOR COLUMNS name;

 DESC EXTENDED students name;
     =default tbl=students
     col_name	name
     data_type	string
     comment	NULL
     min	NULL
     max	NULL
     num_nulls	0
     distinct_count	3
     avg_col_len	11
     max_col_len	13
     histogram	NULL