PostgreSQLでは継承を用いてパーティショニングを行えるが,それで検索性能がどうなるのかを調べてみる.パーティショニングの有無とCE(Constraint Exclusion)によるパーティションプルーニングの有無を変えて,調べてみます.

PostgreSQLのパーティショニングを試す

前回,ポスグレでインターバルパーティショニングもどきを試したが,性能には深入りしていなかった. 今回は,パーティショニングによって検索性能がどうなるか確認してみます.

テーブル定義

テーブルの論理的な構造は,ApacheのログをDBで解析するを参照.

これに対して,reqtime(リクエスト日時)をパーティションキーとした,6ヶ月ごとのインターバルパーティションもどきを実施している. パーティション数は23個で,レコードの合計件数は約1400万件.

  • パーティション親テーブル:p_apache_log
  • パーティション子テーブル:p_apache_log_1〜p_apache_log_23 (計23個)

検証用のSQL

SELECT
	count(*) AS nreq,
	max(reqtime) AS ltime,
	target AS t
FROM
	p_apache_log
WHERE
	vhost = 'kamoland.com' AND
	reqtime >= date_trunc('month', current_timestamp - interval '1 day') AND
	reqtime < date_trunc('month', current_timestamp - interval '1 day') + interval '1 month'
	AND
	statuscode IN ('200', '206', '304')
	AND
	-- FILEEXCLUDE
	NOT EXISTS (
		SELECT 1 FROM al_target_ex WHERE target LIKE ex_pattern
	)
	AND
	-- BROWEXCLUDE
	NOT EXISTS (
		SELECT 1 FROM al_useragent_ex WHERE useragent LIKE ex_pattern
	)
GROUP BY
	t
ORDER BY
	nreq DESC
LIMIT 100
何やらいろいろな条件が付いているが,重要なのは,
  • パーティション親テーブルである「p_apache_log」を検索していること
  • パーティションキーであるreqtimeに,以下のように「直近1ヶ月のみ」という条件が付いていること

reqtime >= date_trunc('month', current_timestamp - interval '1 day') AND
reqtime < date_trunc('month', current_timestamp - interval '1 day') + interval '1 month'
つまり,6ヶ月の境界をまたがない限りは,1個のパーティションだけを検索すべきとなる.

検証結果

パーティショニングしていない状態と比較するために,1テーブルに全データを叩き込んだテーブルも用意して, それぞれに対してSQLを流した結果を見る.

CEが効いていない状態

非パーティション
----- 実行計画 -----
Sort  (cost=1404892.20..1404892.67 rows=187 width=43) (actual time=62501.980..62502.053 rows=28 loops=1)
  Sort Key: (date_trunc('day'::text, apache_log.reqtime))
  Sort Method:  quicksort  Memory: 18kB
  ->  HashAggregate  (cost=1404879.53..1404885.14 rows=187 width=43) (actual time=62501.715..62501.828 rows=28 loops=1)
        ->  Bitmap Heap Scan on apache_log  (cost=17089.58..1404142.30 rows=73723 width=43) (actual time=2825.112..62045.935 rows=39446 loops=1)
              Recheck Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
              Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
              ->  Bitmap Index Scan on apache_log_p  (cost=0.00..17071.15 rows=339037 width=0) (actual time=2750.505..2750.505 rows=365057 loops=1)
                    Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
              SubPlan
                ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.027..0.027 rows=0 loops=55137)
                      Filter: ($1 ~~ ex_pattern)
                ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=351568)
                      Filter: ($0 ~~ ex_pattern)
Total runtime: 62502.511 ms

パーティション化

----- 実行計画 -----
Sort  (cost=1082047.00..1082047.50 rows=200 width=64) (actual time=15997.923..15997.993 rows=28 loops=1)
  Sort Key: (date_trunc('day'::text, public.p_apache_log.reqtime))
  Sort Method:  quicksort  Memory: 18kB
  ->  HashAggregate  (cost=1082033.36..1082039.36 rows=200 width=64) (actual time=15997.596..15997.714 rows=28 loops=1)
        ->  Result  (cost=0.00..1081215.26 rows=81810 width=64) (actual time=1868.087..15633.814 rows=39526 loops=1)
              ->  Append  (cost=0.00..1080806.21 rows=81810 width=64) (actual time=1868.039..15237.331 rows=39526 loops=1)
                    ->  Seq Scan on p_apache_log  (cost=0.00..600.89 rows=1 width=40) (actual time=0.010..0.010 rows=0 loops=1)
                          Filter: ((vhost = 'kamoland.com'::text) AND (statuscode = ANY ('{200,206,304}'::text[])) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval)))
AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_13_p on p_apache_log_13 p_apache_log  (cost=0.02..11.14 rows=1 width=28) (actual time=81.347..81.347 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_14_p on p_apache_log_14 p_apache_log  (cost=0.02..11.15 rows=1 width=28) (actual time=72.648..72.648 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_15_p on p_apache_log_15 p_apache_log  (cost=0.02..11.14 rows=1 width=28) (actual time=79.597..79.597 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_16_p on p_apache_log_16 p_apache_log  (cost=0.02..11.14 rows=1 width=28) (actual time=79.352..79.352 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_17_p on p_apache_log_17 p_apache_log  (cost=0.02..11.19 rows=1 width=48) (actual time=67.322..67.322 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_18_p on p_apache_log_18 p_apache_log  (cost=0.02..11.22 rows=1 width=46) (actual time=76.467..76.467 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_19_p on p_apache_log_19 p_apache_log  (cost=0.02..11.27 rows=1 width=49) (actual time=73.714..73.714 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_20_p on p_apache_log_20 p_apache_log  (cost=0.02..11.25 rows=1 width=48) (actual time=59.570..59.570 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_21_p on p_apache_log_21 p_apache_log  (cost=0.02..11.26 rows=1 width=61) (actual time=62.291..62.291 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_22_p on p_apache_log_22 p_apache_log  (cost=0.02..11.31 rows=1 width=64) (actual time=61.634..61.634 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Bitmap Heap Scan on p_apache_log_23 p_apache_log  (cost=12607.70..1079959.72 rows=81787 width=60) (actual time=1154.020..13341.304 rows=39526 loops=1)
                          Recheck Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          ->  Bitmap Index Scan on p_apache_log_23_p  (cost=0.00..12587.25 rows=365020 width=0) (actual time=1140.830..1140.830 rows=366568 loops=1)
                                Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_1_p on p_apache_log_1 p_apache_log  (cost=0.02..11.08 rows=1 width=30) (actual time=63.055..63.055 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_2_p on p_apache_log_2 p_apache_log  (cost=0.02..11.10 rows=1 width=30) (actual time=76.574..76.574 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_3_p on p_apache_log_3 p_apache_log  (cost=0.02..11.13 rows=1 width=31) (actual time=68.027..68.027 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_4_p on p_apache_log_4 p_apache_log  (cost=0.02..11.13 rows=1 width=32) (actual time=85.115..85.115 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_5_p on p_apache_log_5 p_apache_log  (cost=0.02..11.12 rows=1 width=28) (actual time=83.645..83.645 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_6_p on p_apache_log_6 p_apache_log  (cost=0.02..11.13 rows=1 width=31) (actual time=63.053..63.053 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_7_p on p_apache_log_7 p_apache_log  (cost=0.02..11.14 rows=1 width=31) (actual time=113.322..113.322 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_8_p on p_apache_log_8 p_apache_log  (cost=0.02..11.14 rows=1 width=32) (actual time=76.876..76.876 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_9_p on p_apache_log_9 p_apache_log  (cost=0.02..11.14 rows=1 width=29) (actual time=75.564..75.564 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_10_p on p_apache_log_10 p_apache_log  (cost=0.02..11.14 rows=1 width=29) (actual time=75.900..75.900 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_11_p on p_apache_log_11 p_apache_log  (cost=0.02..11.14 rows=1 width=28) (actual time=72.422..72.422 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Index Scan using p_apache_log_12_p on p_apache_log_12 p_apache_log  (cost=0.02..11.14 rows=1 width=28) (actual time=110.836..110.836 rows=0 loops=1)
                          Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.026..0.026 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
Total runtime: 16002.931 ms
パーティション化した方はやたらと実行計画が長いが,しかしめでたく実行時間は速くなっている.パーティショニングによって,62秒から16秒に短縮された.

一体何でこのような実行時間の違いが出たのか,実行計画を見比べてみる.

非パーティショニングの場合,実データ取得部の抜粋

->  Bitmap Heap Scan on apache_log  (cost=17089.58..1404142.30 rows=73723 width=43) (actual time=2825.112..62045.935 rows=39446 loops=1)
      Recheck Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
      Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
      ->  Bitmap Index Scan on apache_log_p  (cost=0.00..17071.15 rows=339037 width=0) (actual time=2750.505..2750.505 rows=365057 loops=1)
            Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))

パーティショニングの場合,実データ取得部の抜粋

->  Bitmap Heap Scan on p_apache_log_23 p_apache_log  (cost=12607.70..1079959.72 rows=81787 width=60) (actual time=1154.020..13341.304 rows=39526 loops=1)
      Recheck Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
      Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
      ->  Bitmap Index Scan on p_apache_log_23_p  (cost=0.00..12587.25 rows=365020 width=0) (actual time=1140.830..1140.830 rows=366568 loops=1)
            Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= date_trunc('month'::text, (now() - '1 day'::interval))) AND (reqtime < (date_trunc('month'::text, (now() - '1 day'::interval)) + '1 mon'::interval)))
どちらもBitmap Heap Scanでほとんどの時間を消費しており,その時間の違いが全体の実行時間の違いに現れている.

まぁBitmap Heap ScanはBitmap Index Scanの宿命なので,仕方ないと言えば仕方ないですわな. (ちなみに,なぜIndex ScanではなくBitmap Index Scanなのかの理由は,見積もり件数が多い場合にそうなるようだ.そしてBitmap Index Scanの場合は, 取得順序が物理順ではないため実データを取得するために上位でBitmap Heap Scanが必要になるとのこと)

結局,

  • テーブルがでかい時に,Bitmap Heap Scanが遅くなる

というのが非パーティショニングの場合の遅さの原因だという推測に至った. パーティショニングしていれば各テーブルは小さく保たれるので,この遅さを回避できると.

...それはいいとして,パーティショニング時の実行計画がやたら長いことからわかるように, 何と,データが入っているパーティション「p_apache_log_23」以外のパーティションにも,Index Scanがかかっているではないか!

CEが効いた状態

最高の検索性能を得るためには,やはり必要なパーティションだけに絞ってアクセスすべきだろう.CE(Constraint Exclusion)によるパーティションプルーニング.

postgresql.confの設定は,ちゃんと

constraint_exclusion = on
にしているので問題ないはずだが...(デフォルトはoff)

調べたところ,パーティションキー(今回はreqtime)にPREPAREでバインドパラメータを使った場合は,CEが使えないようだ. 理由は,実行計画を決めるときにパーティションキーが不明だと,どのパーティションを見ればよいのかがわからないため.

今回のSQLでは,パーティションキーの条件がバインドパラメータという訳ではないのだが, しかしcurrent_timestampというSTABLE属性の関数なので,実行計画を決めるときに不定であることに違いはない.ぎゃぼ.

...結局、CEが効くのは、

  • パーティションキーが定数値または、定数値によるIMMUTABLE関数呼び出しの場合

となるのか?

とりあえず性能がどうなるかを見たいので,reqtimeの条件を「SQLに定数でべた書き」するように変更して実行してみた.

非パーティション

----- 実行計画 -----
Sort  (cost=1384433.78..1384434.24 rows=184 width=43) (actual time=59293.743..59293.818 rows=28 loops=1)
  Sort Key: (date_trunc('day'::text, apache_log.reqtime))
  Sort Method:  quicksort  Memory: 18kB
  ->  HashAggregate  (cost=1384421.33..1384426.85 rows=184 width=43) (actual time=59293.478..59293.593 rows=28 loops=1)
        ->  Bitmap Heap Scan on apache_log  (cost=16860.48..1383693.99 rows=72734 width=43) (actual time=1988.979..58839.242 rows=39446 loops=1)
              Recheck Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= '2008-09-01 00:00:00'::timestamp without time zone) AND (reqtime < '2008-10-01 00:00:00'::timestamp without time zone))
              Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
              ->  Bitmap Index Scan on apache_log_p  (cost=0.00..16842.30 rows=334490 width=0) (actual time=1923.959..1923.959 rows=365057 loops=1)
                    Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= '2008-09-01 00:00:00'::timestamp without time zone) AND (reqtime < '2008-10-01 00:00:00'::timestamp without time zone))
              SubPlan
                ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.027..0.027 rows=0 loops=55137)
                      Filter: ($1 ~~ ex_pattern)
                ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=351568)
                      Filter: ($0 ~~ ex_pattern)
Total runtime: 59294.227 ms
パーティション化
----- 実行計画 -----
Sort  (cost=1061163.58..1061164.08 rows=200 width=60) (actual time=14371.107..14371.176 rows=28 loops=1)
  Sort Key: (date_trunc('day'::text, public.p_apache_log.reqtime))
  Sort Method:  quicksort  Memory: 18kB
  ->  HashAggregate  (cost=1061149.94..1061155.94 rows=200 width=60) (actual time=14370.840..14370.957 rows=28 loops=1)
        ->  Result  (cost=0.00..1060343.32 rows=80662 width=60) (actual time=629.614..13976.920 rows=39526 loops=1)
              ->  Append  (cost=0.00..1059940.01 rows=80662 width=60) (actual time=629.582..13560.505 rows=39526 loops=1)
                    ->  Seq Scan on p_apache_log  (cost=0.00..597.21 rows=1 width=40) (actual time=0.006..0.006 rows=0 loops=1)
                          Filter: ((reqtime >= '2008-09-01 00:00:00'::timestamp without time zone) AND (reqtime < '2008-10-01 00:00:00'::timestamp without time zone)
AND (vhost = 'kamoland.com'::text) AND (statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.027..0.027 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
                    ->  Bitmap Heap Scan on p_apache_log_23 p_apache_log  (cost=12432.58..1059342.80 rows=80661 width=60) (actual time=629.567..13336.685 rows=39526 loops=1)
                          Recheck Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= '2008-09-01 00:00:00'::timestamp without time zone) AND (reqtime < '2008-10-01 00:00:00'::timestamp without time zone))
                          Filter: ((statuscode = ANY ('{200,206,304}'::text[])) AND (NOT (subplan)) AND (NOT (subplan)))
                          ->  Bitmap Index Scan on p_apache_log_23_p  (cost=0.00..12412.41 rows=359994 width=0) (actual time=604.247..604.247 rows=366568 loops=1)
                                Index Cond: ((vhost = 'kamoland.com'::text) AND (reqtime >= '2008-09-01 00:00:00'::timestamp without time zone) AND (reqtime < '2008-10-01 00:00:00'::timestamp without time zone))
                          SubPlan
                            ->  Seq Scan on al_useragent_ex  (cost=0.00..1.40 rows=1 width=0) (actual time=0.027..0.027 rows=0 loops=55392)
                                  Filter: ($1 ~~ ex_pattern)
                            ->  Seq Scan on al_target_ex  (cost=0.00..1.38 rows=1 width=0) (actual time=0.022..0.022 rows=1 loops=352982)
                                  Filter: ($0 ~~ ex_pattern)
Total runtime: 14371.685 ms
CEが効いた!そして,パーティション有無に関わらず,どちらも速くなった.しかし,それほどでも無いような...

まとめ

今回の検証SQLの所要時間(sec)
検索条件非パーティションパーティション
CE無効6216
CE有効5914

パーティショニングしてCE有効というのが,やはり最速でした.

しかし結局,カモランドのログ処理プログラムでは,CEを有効にするためにプログラムを書き換えるほどの長所を感じないので, CE無効のままで運用しています.


© 2024 KMIソフトウェア