Oracle – Data Visualization Of DateTime Stamps – The Big Picture 24-hour View By Date Using PIVOT

One of the interesting aspects of working with data is how every person can look at the same thing and come to different conclusions. No one is right or wrong but a good visualization will lead to good decision-making.

One of the leading data visualization patterns is pivoting and pivoting is everyone’s first choice when they need to look at the “big picture”. I am no exception to the rule here. I use this pattern very often in my day to day job.

The boring old data output

Let us consider an example. Data does not get more mundane than this sample set that shows the audit log for user logins.

SELECT os_username, timestamp, action_name
FROM sys.dba_audit_trail
WHERE
    action_name LIKE 'LOGON'

This is a small subset of the results:

OS_USERNAME TIMESTAMP ACTION_NAME
ora_user 2/4/2013 8:19 LOGON
ora_user 2/4/2013 8:19 LOGON
ora_user 2/4/2013 8:19 LOGON
service_acct_user 2/4/2013 8:22 LOGON
service_acct_user 2/4/2013 8:23 LOGON
service_acct_user 2/4/2013 8:25 LOGON
service_acct_user 2/4/2013 8:27 LOGON
user_kevin 2/4/2013 8:47 LOGON
user_chris 2/4/2013 10:38 LOGON
USER_SMITH 2/4/2013 10:41 LOGON
user_travis 2/4/2013 11:01 LOGON
user_travis 2/4/2013 11:02 LOGON
user_grim 2/4/2013 14:39 LOGON
user_travis 2/4/2013 16:26 LOGON
USER_STEVE 2/4/2013 19:41 LOGON
service_account_user2 2/4/2013 21:02 LOGON
service_account_user2 2/4/2013 21:02 LOGON
service_account_user2 2/4/2013 21:02 LOGON
service_account_user2 2/4/2013 21:02 LOGON
…..Plus a lot more rows

As it stands, it is queryable and all the data is there but the results are not very useful. If I had something like the below, that would help.

The visual transformation – 24 hour view by user and date

Would it not be nice to have a map of the count of logins by the hour for each date data is available for, by each hour in the 24 hour period for each day by person.  Here is how it looks:

OS_USERNAME DATED 01′ 02′ 03′ 04′ 05′ 06′ 07′ 08′ 09′ 10′ 11′ 12′ 13′ 14′ 15′ 16′ 17′ 18′ 19′ 20′ 21′ 22′ 23′
USER_MIKE 5-Feb-13 1 1
USER_JEFF 5-Feb-13 1 2 1 1
USER_JEFF 6-Feb-13 1 3
USER_TOM 4-Feb-13 1 1 2 3
USER_TOM 5-Feb-13 2 1 2
USER_TOM 6-Feb-13 2 1 2 1 1 1
USER_TOM 7-Feb-13 1 2 1 1
USER_TOM 8-Feb-13 1 2 2 1
SYSTEM 4-Feb-13 1
SYSTEM 6-Feb-13 1
USER_AVAIL 4-Feb-13 2 4 2 2 2 4
USER_AVAIL 5-Feb-13 2 4 6 4 4
USER_AVAIL 6-Feb-13 5 2 2 6 4 2 2 4
USER_AVAIL 7-Feb-13 2 2 6 2 2 4 2
USER_AVAIL 8-Feb-13 2 4 6 4 2 2
USER_BRENT 4-Feb-13 1 2 1 1 1 1
USER_BRENT 5-Feb-13 1 2 2 1 1
USER_BRENT 6-Feb-13 1 1 1 3 1 1
USER_BRENT 7-Feb-13 1 2 3
USER_BRENT 8-Feb-13 1 1 1 2 1
USER_DAMIEN 5-Feb-13 2 3 1 3 2 4 7 2 1
USER_DAMIEN 6-Feb-13 1 3 2 3 2 4 1 5
USER_DAMIEN 7-Feb-13 2 4 2 1 1 3 6 4 1
USER_DAMIEN 8-Feb-13 3 5 4 1 2 2 1 2
batch_user 4-Feb-13 67 544 546 543 543 541 539 536 537 536 503 482 482 427 420 421
batch_user 5-Feb-13 420 420 476 480 480 474 551 543 544 508 523 511 533 520 501 480 474 452 423 427 371 372 360
batch_user 6-Feb-13 360 360 415 420 420 400 482 483 478 471 470 458 476 466 458 456 462 453 422 422 372 363 360
batch_user 7-Feb-13 362 361 415 421 419 393 474 486 477 447 469 481 474 439 462 475 443 448 423 422 369 362 361
batch_user 8-Feb-13 361 360 414 420 420 433 482 484 483 483 483 483 490 481 482 484 305 96 60 60 7
batch_user 9-Feb-13 1 2 179 360 360 360 305 301 305 360 360 360 360 360 360 360 360 360 363
batch_user 10-Feb-13 398 423 420 420 420 422 422 422 402 362 361 386 397 366 382 372 402 421 437 384 365 386 362
batch_user 11-Feb-13 366 376 473 480 480 457 541 447
ora_user 4-Feb-13 53 193 197 174 174 166 166 172 146 160 152 146 153 147 146 146
ora_user 5-Feb-13 145 145 145 145 145 168 193 173 169 204 195 200 176 206 193 180 164 159 147 154 153 167 147
ora_user 6-Feb-13 155 155 145 145 145 154 170 180 167 179 209 197 182 183 171 167 154 160 149 149 160 150 166
ora_user 7-Feb-13 147 146 145 152 150 150 163 172 186 168 180 190 211 191 189 158 148 165 155 149 149 145 145
ora_user 8-Feb-13 145 145 144 145 145 154 148 159 162 187 176 171 159 168 187 160 147 146 145 145 148 145 94
ora_user 9-Feb-13 146 145 145 144 145 145 149 145 145 147 159 145 145 146 145 144 145 145 145 145 145 145 145
ora_user 10-Feb-13 145 145 145 145 145 152 148 145 145 145 145 145 144 146 145 145 145 147 177 179 149 144 145
ora_user 11-Feb-13 145 145 145 145 145 148 152 108
sys_user 4-Feb-13 3 79 78 79 79 79 78 79 71 78 79 79 79 79 78 79
sys_user 5-Feb-13 78 79 79 79 79 78 79 77 78 79 79 78 79 79 78 79 79 78 79 79 79 79 79
sys_user 6-Feb-13 79 79 78 78 79 79 79 78 79 79 79 79 78 79 79 79 79 78 79 79 79 79 79
sys_user 7-Feb-13 79 78 79 79 79 77 79 78 79 79 79 79 78 79 79 79 78 79 79 79 78 80 79
sys_user 8-Feb-13 79 79 78 79 79 79 79 79 78 79 78 78 79 79 79 84 78 79 79 79 79 79 47
sys_user 9-Feb-13 76 79 79 79 79 78 79 78 79 76 79 79 79 78 79 79 79 78 79 79 78 79 79
sys_user 10-Feb-13 78 79 79 79 79 79 79 79 79 78 79 79 79 79 78 79 79 79 79 78 78 79 78
sys_user 11-Feb-13 79 79 78 79 79 79 77 75
service_account_user2 4-Feb-13 41
service_account_user2 5-Feb-13 41
service_account_user2 6-Feb-13 41
service_account_user2 7-Feb-13 41
service_account_user2 8-Feb-13 41
service_account_user2 9-Feb-13 41
service_account_user2 10-Feb-13 41

I have only included a portion of the data to illustrate a few obvious things that emerge when looking at the big picture:

  • The accounts towards the top are not used much
  • The accounts towards the top are only used during work hours
  • The account batch_user is the most heavily used account and is used round the clock
  • Some accounts like service_account_user2 are used probably as part of a scheduled service that runs at a specific time every night
  • …plus a lot more that others can decide for themselves from looking at the data

Without PIVOT

In the years before, I have been using a query like this to get the data pivoted:

SELECT
    os_username,
  to_char(timestamp,'YYYY-MM-DD') dated,
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'00',1,0))) "00",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'01',1,0))) "01",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'02',1,0))) "02",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'03',1,0))) "03",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'04',1,0))) "04",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'05',1,0))) "05",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'06',1,0))) "06",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'07',1,0))) "07",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'08',1,0))) "08",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'09',1,0))) "09",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'10',1,0))) "10",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'11',1,0))) "11",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'12',1,0))) "12",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'13',1,0))) "13",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'14',1,0))) "14",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'15',1,0))) "15",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'16',1,0))) "16",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'17',1,0))) "17",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'18',1,0))) "18",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'19',1,0))) "19",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'20',1,0))) "20",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'21',1,0))) "21",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'22',1,0))) "22",
  round(sum(decode(substr(to_char(timestamp,'HH24'),1,2),'23',1,0))) "23"
FROM sys.dba_audit_trail
WHERE
    action_name LIKE 'LOGON'
GROUP BY os_username, TO_CHAR(timestamp,'YYYY-MM-DD')
ORDER BY os_username, TO_CHAR(timestamp,'YYYY-MM-DD')

This query gets the job done and techniques like this are abound on the internet but there is nothing particularly interesting about it.

With PIVOT

With 11g, there is no reason to be writing complex syntax like above. The same query when used with the PIVOT functionality in 11g, becomes:

SELECT *
FROM (
    SELECT
        os_username,
        TRUNC(timestamp) dated,
        SUBSTR(TO_CHAR(timestamp,'HH24'),1,2) tm
    FROM sys.dba_audit_trail
    WHERE
        action_name LIKE 'LOGON'
)
PIVOT
(
   SUM(1)
   FOR tm IN ('01','02','03','04','05', '06', '07',
                '08','09','10','11','12', '13',
                '14','15','16','17','18', '19',
                '20','21','22','23')
) a
ORDER BY os_username, dated

All I have done is relegated what goes into the columns between 00 and 23 to be done by the PIVOT clause.

In this case I am doing an aggregation of SUM. It can by any other aggregation and it has to be an aggregation. The group by column specified in the “FOR” clause for PIVOTing is “tm” which is part of the main SELECT. The results are then ordered by what is specified in the ORDER BY clause.

Other applications

The example above showed how to get a daily 24 hour PIVOT visualization. It is just as easy to
• Get a monthly view
• A quarterly view
• An yearly view
• …add more as your imagination permits

Just as it was applied to DateTime stamp column you may apply this to any column.

UNPIVOT works the other way round where you can convert data that is in a pivoted form to the mundane row form for easier querying and manipulation.

How to do this in SQL Server?

If you are looking to do something like this with SQL Server, please checkout my post on that.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s