Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Selecting and ordering data over multiple date ranges

by peppiv (Curate)
on Feb 07, 2003 at 15:34 UTC ( [id://233493]=perlquestion: print w/replies, xml ) Need Help??

peppiv has asked for the wisdom of the Perl Monks concerning the following question:

Got a doozy here. Part of the web reporting that Mgmt wants includes a running list of subscribers to our newsletter.

Background - I'm running Linux/Apache with FreeTDS/DBD::Sybase connecting to a Win2K box with SQL Server 7.0.

Every time someone subscribes they put in their address (specifically the State they live in) and the DB includes the date/time they subscribed.

Sample record:
Col 1 - John Smith Col 2 - NY Col 3 - 02/06/2003

I print to screen the top ten states and the count of subscribers in descending order.

Example
Jan Feb Mar New York 115 0 0 Florida 34 0 0 Georgia 28 0 0


Here's how I do it.
my $sth = $dbh->prepare("SELECT DISTINCT state, COUNT(*) FROM con +tacts WHERE state != '' AND identifier BETWEEN '2003-01-01' AND '2003 +-02-01' GROUP BY state ORDER BY 2 DESC"); $sth->execute() or die $sth->errstr; my $rows = 0; while (my @result = $sth->fetchrow_array()) { last if $rows++ >= 10; print qq(<tr><td><p style="margin-left: 10"><font size="2" face=" +Arial">$result[0]</font></td><td><div align="right"><font size="2" fa +ce="Arial">&nbsp;</font></div></td><td><font size="2" face="Arial"><d +iv align="right">$result[1]</div></font></td><td><div align="right">< +font size="2" face="Arial">0</font></div></td><tr>\n); } $sth->finish();


But how can I do this for the following months?
Here's what I need it to look like.

Jan Feb Mar New York 115 100 0 Georgia 28 45 0 Florida 34 10 0


I need the order to be listed by the current month and still show what happened the previous months.

Can anyone help point me in the right direction?

peppiv

Replies are listed 'Best First'.
Re: Selecting and ordering data over multiple date ranges
by gmax (Abbot) on Feb 07, 2003 at 16:55 UTC
    To get a cross tab, you can use the following technique. Only one query is needed.
    SELECT state, sum(IF(month(identifier)=1,1,0)) as 'Jan', sum(IF(month(identifier)=2,1,0)) as 'Feb', sum(IF(month(identifier)=3,1,0)) as 'Mar', count(*) as TOTAL from contacts WHERE state != '' GROUP BY state;
    And you should get a result like :
    +----------+-----+-----+----+-------+ | state | Jan | Feb | Mar| Total | +----------+-----+-----+----+-------+ | NY | 10 | 12 | 11 | 33 | | IL | 15 | 14 | 15 | 44 | | TX | 25 | 20 | 22 | 67 | +----------+-----+-----+----+-------+
    Check this article for a step-by-step tutorial on how to use Perl to get get the values for your columns in the query without doing it manually. The article is specific for MySQL, but the algorithm works on any DBMS.
    _ _ _ _ (_|| | |(_|>< _|
      Once again gmax you've made my day better. Check it out:
      "SELECT state, SUM(case when identifier BETWEEN '2003-01-01' AND '2003 +-02-01' then 1 else 0 end) AS 'Jan', SUM(case when identifier BETWEEN + '2003-02-01' AND '2003-03-01' then 1 else 0 end) AS 'Feb', COUNT(*) +as TOTAL FROM contacts WHERE state != '' GROUP BY state ORDER BY 3 DE +SC"


      The 'case' instead of IF works because it's on MSQl. Also, it requires the 'then 1 else 0 end'.

      Works like a charm! Soooopa Thanx

      peppiv

Re: Selecting and ordering data over multiple date ranges
by dws (Chancellor) on Feb 07, 2003 at 19:29 UTC
    You basically have data that you want to divide into buckets, with aggregate calculations performed on each bucket. One way to do this is to use a separate table to define the buckets, then JOIN against it to determine which bucket a date goes into, and GROUP BY the bucket so that aggregate calculations apply to the bucket.

    Consider a table "bucket", which looks like:

    +--------+------------+------------+ | name | begin | end | +--------+------------+------------+ | Jan | 2003-01-01 | 2003-01-31 | +--------+------------+------------+ | ... | +--------+------------+------------+ | Dec | 2003-12-01 | 2003-12-30 | +--------+------------+------------+
    Given this, you then issue a query like
    SELECT contacts.state, bucket.name, count(*) FROM contacts, bucket WHERE contacts.indentifer BETWEEN bucket.begin AND bucket.end GROUP BY bucket.name, contacts.state ORDER BY contacts.state
    This'll give you a set of (state, month, count) tuples that you might need to massage a bit before emitting as an HTML table. You may find it convenient to extend the query to use an outer join so that you'll get tuples for months with no contact events.

    I haven't tested the query, but I do something similar in another context to divide data into buckets by date.

    The nice thing about this approach is that you can reload the bucket table with other data to get different groupings.

    +--------+------------+------------+ | name | begin | end | +--------+------------+------------+ | 1Q03 | 2003-01-01 | 2003-03-31 | +--------+------------+------------+ | ... | +--------+------------+------------+ | 4Q03 | 2003-10-01 | 2003-12-30 | +--------+------------+------------+
    will give you a breakdown by quarters, without changing the query.

Re: Selecting and ordering data over multiple date ranges
by steves (Curate) on Feb 07, 2003 at 15:52 UTC

    One way would be to GROUP BY state and month -- not just state, but your output would change to rows like this:

    New York Jan 115 New York Feb 100 New York Mar 0 Georgia Jan 28

      If you save these results in a HashofHashes (keyed by the name of the State and the month), the reformatting into a table with State-rows and Month-columns will be easy (and left for the attentive student as an exercise).

      CountZero

      "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Selecting and ordering data over multiple date ranges
by steves (Curate) on Feb 07, 2003 at 16:25 UTC

    I was thinking you might be able to get the headings as desired by using in-line views if your DB supports them. I tried this sort of thing against an Oracle DB here. But it was taking so long I killed it:

    SELECT o.state, dec.total, jan.total, feb.total, count(*) FROM ( SELECT state, to_char(order_date, 'Mon'), count(*) total FROM web_order WHERE order_date BETWEEN '01-DEC-02' AND '31-DEC-02' GROUP BY state, to_char(order_date, 'Mon') ) dec, ( SELECT state, to_char(order_date, 'Mon'), count(*) total FROM web_order WHERE order_date BETWEEN '01-JAN-03' AND '31-JAN-03' GROUP BY state, to_char(order_date, 'Mon') ) jan, ( SELECT state, to_char(order_date, 'Mon'), count(*) total FROM web_order WHERE order_date BETWEEN '01-FEB-03' AND '28-FEB-03' GROUP BY state, to_char(order_date, 'Mon') ) feb, web_order o WHERE o.order_date BETWEEN '01-DEC-02' AND '28-FEB-03' GROUP BY o.state, dec.total, jan.total, feb.total

    As an alternative, it may be easier to do the state/month grouping and reformat yourself after getting the rows the DB hands back. I personally find Perl much better at more complex reformatting than most DB's.

      Yeah, I may try the state/month grouping and reformat with Perl. Friday's never a good day to have to use your brain.

      Thanks steves

      peppiv

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://233493]
Approved by bronto
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others studying the Monastery: (5)
As of 2024-04-25 15:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found