Using built-in statistical functions

SELECT 
    min(numage),
    avg(numage),
    max(numage)
FROM SN7577;

The Distinct keyword

  • For the SN7577 table the allowable values in many of the columns are listed in the SN7577 document.
  • But this doesn’t mean that they all actually appear in the data.

  • To obtain a list of umique values in a particular column you can use the Distinct keyword.
  • We will use the text version of the SN7577 table for these examples.

  • For Q1 How would you vote if there were a General Election tomorrow? there are 11 possibilities listed and another (-1) for a missing value.

To find out which values are in the data we can use the query:

SELECT DISTINCT Q1
FROM SN7577_Text;
  • We can see from the reults of running this that all 11 values are represented and that there is no missing data in this field.

  • You can have more than one column name after the Distinct keyword.
  • the results will include a row for each unique combination of the columns involved

Exercise

Q3 asks for the likelihood of voting on a scale of 1 - 10 (10 - Absolutely certain to vote). Write a query which shows all of the combinations of voting intentions (Q1) and likelihood of voting (Q3).

Solution:

Sorting the results by Q1 gives the political parties in alphabetical order.

SELECT DISTINCT Q1, Q3
FROM SN7577_Text
ORDER BY Q1

The group by clause to summarise data

  • Just knowing the combinations is of limited use. You really want to know How many of each of the values there are.
  • To do this we use the Group By clause.
SELECT Q1,
       count(*) as Num_potential_voters
FROM SN7577_Text
GROUP BY Q1
ORDER BY Q1;
  • This query gives us a count of potential voters for each party.

  • three aggregation were performed over the single column Q1.
  • It is possible to aggregate over multiple columns by specifying them in both the select and the group by clause.

  • The grouping will take place based on the order of the columns listed in the group by clause.

  • What is not allowed is specifying a non-aggregated column in the select clause which is not mentioned in the group by clause.

Using the having clause

  • In order to filter the rows returned in a non-aggregated query we used the where clause.

  • For an aggregated query the equivalent is the having clause.
  • You use the having clause by providing it with a filter expression which references one of the aggregated columns.

  • In a having clause you can use the column alias to refer to the aggregated column.

SELECT Q1 ,
       sum(daily3) as Telegraph_reader
FROM SN7577 
GROUP BY Q1
HAVING Telegraph_reader > 5;
  • In this example aggregating Telegraph readers by voter intentions (Q1).

  • We are only interested in the groups where there are more than 5 ‘Telegraph’ readers.

Exercise

In the UK the ‘Telegraph’ is regarded as a right-wing newspaper and the Conservatives are considered to be a right-wing political party.

Do the results of the query above support this?

The ‘Mirror’ is a left-wing newspaper and Labour are considered to be a left-wing political party. Re-write the query above to see which group of supporters are more likely to read the Mirror

You can browse the Newspapers table to find out which of the daily columns refers to the Mirror

Solution:

SELECT Q1 ,
       sum(daily12) as Mirror_reader
FROM SN7577 
GROUP BY Q1
HAVING Mirror_reader > 5;
LS0tCnRpdGxlOiAiQWdncmVnYXRpb25zIgpvdXRwdXQ6IGh0bWxfbm90ZWJvb2sKLS0tCgoKIyBVc2luZyBidWlsdC1pbiBzdGF0aXN0aWNhbCBmdW5jdGlvbnMKCiogQWdncmVnYXRlIGZ1bmN0aW9ucyBhcmUgdXNlZCBwZXJmb3JtIHNvbWUga2luZCBvZiBtYXRoZW1hdGljYWwgb3Igc3RhdGlzdGljYWwgY2FsY3VsYXRpb24gYWNyb3NzIGEgZ3JvdXAgb2Ygcm93cy4gCiogVGhlIHJvd3MgaW4gZWFjaCBncm91cCBhcmUgZGV0ZXJtaW5lZCBieSB0aGUgZGlmZmVyZW50IHZhbHVlcyBpbiBhIHNwZWNpZmllZCBjb2x1bW4gb3IgY29sdW1ucy4gCiogQWx0ZXJuYXRpdmVseSB5b3UgY2FuIGFnZ3JlZ2F0ZSBhY3Jvc3MgdGhlIGVudGlyZSB0YWJsZS4KCgoqIElmIHdlIHdhbnRlZCB0byBrbm93IHRoZSBgbWluaW11bWAsIGBhdmVyYWdlYCBhbmQgYG1heGltdW0gYWdlYCB2YWx1ZXMgYChudW1hZ2UpYCBhY3Jvc3MgdGhlIHdob2xlIFNONzU3NyB0YWJsZSB3ZSBjb3VsZCB3cml0ZSBhIHF1ZXJ5IHN1Y2ggYXMgdGhpczoKCmBgYHNxbApTRUxFQ1QgCiAgICBtaW4obnVtYWdlKSwKICAgIGF2ZyhudW1hZ2UpLAogICAgbWF4KG51bWFnZSkKRlJPTSBTTjc1Nzc7CgpgYGAKCiogVGhpcyBzb3J0IG9mIHF1ZXJ5IHByb3ZpZGVzIHVzIHdpdGggYSBnZW5lcmFsIHZpZXcgb2YgdGhlIHZhbHVlcyBmb3IgYSBwYXJ0aWN1bGFyIGNvbHVtbiBvciBmaWVsZCBhY3Jvc3MgdGhlIHdob2xlIHRhYmxlLgoKKiBgbWluYCAsIGBtYXhgIGFuZCBgYXZnYCBhcmUgYnVpbHRpbiBhZ2dyZWdhdGUgZnVuY3Rpb25zIGluIFNRTGl0ZSAoYW5kIGFueSBvdGhlciBTUUwgZGF0YWJhc2Ugc3lzdGVtKS4gCiogVGhlcmUgYXJlIG90aGVyIHN1Y2ggZnVuY3Rpb25zIGF2YWlhbGFibGUuIApodHRwczovL3NxbGl0ZS5vcmcvbGFuZ19hZ2dmdW5jLmh0bWwKCiogSXQgaXMgbW9yZSBsaWtlbHkgdGhhdCB3ZSB3b3VsZCB3YW50IHRvIGZpbmQgc3VjaCB2YWx1ZXMgZm9yIGEgcmFuZ2UsIG9yIG11bHRpcGxlIHJhbmdlcyBvZiByb3dzIHdoZXJlIGVhY2ggcmFuZ2UgaXMgZGV0ZXJtaW5lZCBieSB0aGUgdmFsdWVzIG9mIHNvbWUgb3RoZXIgY29sdW1uIGluIHRoZSB0YWJsZS4gCiogQmVmb3JlIHdlIGRvIHRoaXMgd2Ugd2lsbCBsb29rIGF0IGhvdyB3ZSBjYW4gZmluZCBvdXQgd2hhdCBkaWZmZXJlbnQgdmFsdWVzIGFyZSBjb250YWluZWQgaW4gYSBnaXZlbiBjb2x1bW4uCgojIyBUaGUgYERpc3RpbmN0YCBrZXl3b3JkCgoqIEZvciB0aGUgU043NTc3IHRhYmxlIHRoZSBhbGxvd2FibGUgdmFsdWVzIGluIG1hbnkgb2YgdGhlIGNvbHVtbnMgYXJlIGxpc3RlZCBpbiB0aGUgU043NTc3IGRvY3VtZW50LiAKKiBCdXQgdGhpcyBkb2VzbuKAmXQgbWVhbiB0aGF0IHRoZXkgYWxsIGFjdHVhbGx5IGFwcGVhciBpbiB0aGUgZGF0YS4KCiogVG8gb2J0YWluIGEgbGlzdCBvZiB1bWlxdWUgdmFsdWVzIGluIGEgcGFydGljdWxhciBjb2x1bW4geW91IGNhbiB1c2UgdGhlIGBEaXN0aW5jdGAga2V5d29yZC4gCiogV2Ugd2lsbCB1c2UgdGhlIHRleHQgdmVyc2lvbiBvZiB0aGUgU043NTc3IHRhYmxlIGZvciB0aGVzZSBleGFtcGxlcy4KCiogRm9yIFExIGBIb3cgd291bGQgeW91IHZvdGUgaWYgdGhlcmUgd2VyZSBhIEdlbmVyYWwgRWxlY3Rpb24gdG9tb3Jyb3c/YCB0aGVyZSBhcmUgMTEgcG9zc2liaWxpdGllcyBsaXN0ZWQgYW5kIGFub3RoZXIgKC0xKSBmb3IgYSBtaXNzaW5nIHZhbHVlLgoKVG8gZmluZCBvdXQgd2hpY2ggdmFsdWVzIGFyZSBpbiB0aGUgZGF0YSB3ZSBjYW4gdXNlIHRoZSBxdWVyeToKCmBgYHNxbApTRUxFQ1QgRElTVElOQ1QgUTEKRlJPTSBTTjc1NzdfVGV4dDsKYGBgCgoqIFdlIGNhbiBzZWUgZnJvbSB0aGUgcmV1bHRzIG9mIHJ1bm5pbmcgdGhpcyB0aGF0IGFsbCAxMSB2YWx1ZXMgYXJlIHJlcHJlc2VudGVkIGFuZCB0aGF0IHRoZXJlIGlzIG5vIG1pc3NpbmcgZGF0YSBpbiB0aGlzIGZpZWxkLgoKKiBZb3UgY2FuIGhhdmUgbW9yZSB0aGFuIG9uZSBjb2x1bW4gbmFtZSBhZnRlciB0aGUgYERpc3RpbmN0YCBrZXl3b3JkLiAKKiB0aGUgcmVzdWx0cyB3aWxsIGluY2x1ZGUgYSByb3cgZm9yIGVhY2ggdW5pcXVlIGNvbWJpbmF0aW9uIG9mIHRoZSBjb2x1bW5zIGludm9sdmVkCgojIyBFeGVyY2lzZSAKUTMgYXNrcyBmb3IgdGhlIGxpa2VsaWhvb2Qgb2Ygdm90aW5nIG9uIGEgc2NhbGUgb2YgMSAtIDEwICgxMCAtIEFic29sdXRlbHkgY2VydGFpbiB0byB2b3RlKS4gV3JpdGUgYSBxdWVyeSB3aGljaCBzaG93cyBhbGwgb2YgdGhlIGNvbWJpbmF0aW9ucyBvZiB2b3RpbmcgaW50ZW50aW9ucyAoUTEpIGFuZCBsaWtlbGlob29kIG9mIHZvdGluZyAoUTMpLgoKIyMjIFNvbHV0aW9uOgpTb3J0aW5nIHRoZSByZXN1bHRzIGJ5IFExIGdpdmVzIHRoZSBwb2xpdGljYWwgcGFydGllcyBpbiBhbHBoYWJldGljYWwgb3JkZXIuCmBgYHNxbApTRUxFQ1QgRElTVElOQ1QgUTEsIFEzCkZST00gU043NTc3X1RleHQKT1JERVIgQlkgUTEKCmBgYAoKIyMgVGhlIGBncm91cCBieWAgY2xhdXNlIHRvIHN1bW1hcmlzZSBkYXRhCgoqIEp1c3Qga25vd2luZyB0aGUgY29tYmluYXRpb25zIGlzIG9mIGxpbWl0ZWQgdXNlLiBZb3UgcmVhbGx5IHdhbnQgdG8ga25vdyBgSG93IG1hbnlgIG9mIGVhY2ggb2YgdGhlIHZhbHVlcyB0aGVyZSBhcmUuIAoqIFRvIGRvIHRoaXMgd2UgdXNlIHRoZSBgR3JvdXAgQnlgIGNsYXVzZS4KCmBgYHNxbApTRUxFQ1QgUTEsCiAgICAgICBjb3VudCgqKSBhcyBOdW1fcG90ZW50aWFsX3ZvdGVycwpGUk9NIFNONzU3N19UZXh0CkdST1VQIEJZIFExCk9SREVSIEJZIFExOwoKYGBgCgoqIFRoaXMgcXVlcnkgZ2l2ZXMgdXMgYSBjb3VudCBvZiBwb3RlbnRpYWwgdm90ZXJzIGZvciBlYWNoIHBhcnR5LgoKKiB0aHJlZSBhZ2dyZWdhdGlvbiB3ZXJlIHBlcmZvcm1lZCBvdmVyIHRoZSBzaW5nbGUgY29sdW1uIFExLiAKKiBJdCBpcyBwb3NzaWJsZSB0byBhZ2dyZWdhdGUgb3ZlciBtdWx0aXBsZSBjb2x1bW5zIGJ5IHNwZWNpZnlpbmcgdGhlbSBpbiBib3RoIHRoZSBzZWxlY3QgYW5kIHRoZSBncm91cCBieSBjbGF1c2UuCgoqIFRoZSBncm91cGluZyB3aWxsIHRha2UgcGxhY2UgYmFzZWQgb24gdGhlIG9yZGVyIG9mIHRoZSBjb2x1bW5zIGxpc3RlZCBpbiB0aGUgZ3JvdXAgYnkgY2xhdXNlLgoKKiBXaGF0IGlzIG5vdCBhbGxvd2VkIGlzIHNwZWNpZnlpbmcgYSBub24tYWdncmVnYXRlZCBjb2x1bW4gaW4gdGhlIHNlbGVjdCBjbGF1c2Ugd2hpY2ggaXMgbm90IG1lbnRpb25lZCBpbiB0aGUgZ3JvdXAgYnkgY2xhdXNlLgoKIyMgVXNpbmcgdGhlIGBoYXZpbmdgIGNsYXVzZQoKKiBJbiBvcmRlciB0byBmaWx0ZXIgdGhlIHJvd3MgcmV0dXJuZWQgaW4gYSBub24tYWdncmVnYXRlZCBxdWVyeSB3ZSB1c2VkIHRoZSBgd2hlcmVgIGNsYXVzZS4gCgoqIEZvciBhbiBhZ2dyZWdhdGVkIHF1ZXJ5IHRoZSBlcXVpdmFsZW50IGlzIHRoZSBgaGF2aW5nYCBjbGF1c2UuCiogWW91IHVzZSB0aGUgYGhhdmluZ2AgY2xhdXNlIGJ5IHByb3ZpZGluZyBpdCB3aXRoIGEgZmlsdGVyIGV4cHJlc3Npb24gd2hpY2ggcmVmZXJlbmNlcyBvbmUgb2YgdGhlIGFnZ3JlZ2F0ZWQgY29sdW1ucy4KCiogSW4gYSBgaGF2aW5nYCBjbGF1c2UgeW91IGNhbiB1c2UgdGhlIGNvbHVtbiBhbGlhcyB0byByZWZlciB0byB0aGUgYWdncmVnYXRlZCBjb2x1bW4uCgpgYGBzcWwKU0VMRUNUIFExICwKICAgICAgIHN1bShkYWlseTMpIGFzIFRlbGVncmFwaF9yZWFkZXIKRlJPTSBTTjc1NzcgCkdST1VQIEJZIFExCkhBVklORyBUZWxlZ3JhcGhfcmVhZGVyID4gNTsKCmBgYAoKKiBJbiB0aGlzIGV4YW1wbGUgYWdncmVnYXRpbmcgYFRlbGVncmFwaCByZWFkZXJzYCBieSB2b3RlciBpbnRlbnRpb25zIGAoUTEpYC4KCiogV2UgYXJlIG9ubHkgaW50ZXJlc3RlZCBpbiB0aGUgZ3JvdXBzIHdoZXJlIHRoZXJlIGFyZSBtb3JlIHRoYW4gNSDigJhUZWxlZ3JhcGjigJkgcmVhZGVycy4KCiMjIEV4ZXJjaXNlIApJbiB0aGUgVUsgdGhlIOKAmFRlbGVncmFwaOKAmSBpcyByZWdhcmRlZCBhcyBhIHJpZ2h0LXdpbmcgbmV3c3BhcGVyIGFuZCB0aGUgQ29uc2VydmF0aXZlcyBhcmUgY29uc2lkZXJlZCB0byBiZSBhIHJpZ2h0LXdpbmcgcG9saXRpY2FsIHBhcnR5LgoKRG8gdGhlIHJlc3VsdHMgb2YgdGhlIHF1ZXJ5IGFib3ZlIHN1cHBvcnQgdGhpcz8KClRoZSDigJhNaXJyb3LigJkgaXMgYSBsZWZ0LXdpbmcgbmV3c3BhcGVyIGFuZCBMYWJvdXIgYXJlIGNvbnNpZGVyZWQgdG8gYmUgYSBsZWZ0LXdpbmcgcG9saXRpY2FsIHBhcnR5LiBSZS13cml0ZSB0aGUgcXVlcnkgYWJvdmUgdG8gc2VlIHdoaWNoIGdyb3VwIG9mIHN1cHBvcnRlcnMgYXJlIG1vcmUgbGlrZWx5IHRvIHJlYWQgdGhlIE1pcnJvcgoKWW91IGNhbiBicm93c2UgdGhlIE5ld3NwYXBlcnMgdGFibGUgdG8gZmluZCBvdXQgd2hpY2ggb2YgdGhlIGRhaWx5IGNvbHVtbnMgcmVmZXJzIHRvIHRoZSBNaXJyb3IKCiMjIyBTb2x1dGlvbjoKYGBgc3FsClNFTEVDVCBRMSAsCiAgICAgICBzdW0oZGFpbHkxMikgYXMgTWlycm9yX3JlYWRlcgpGUk9NIFNONzU3NyAKR1JPVVAgQlkgUTEKSEFWSU5HIE1pcnJvcl9yZWFkZXIgPiA1OwpgYGAKCgoKCgoKCgoKCgo=