$NM | Where State -ne WA | group -property Contributor,State,Amount -noelement | sort -desc Count,Name | ft -auto -wrap
Count Name
----- ----
1 WEBB LISA, MT, 100
1 VASKAS JANET, PA, 100
1 VASKAS ALAN, PA, 100
1 TURNER ZACHARY M, CO, 900
1 TERESA JUDITH, WV, 50
1 SOTO JLEANA, CA, 100
1 MCCLENDON SUSAN, GA, 500
1 ATU, DC, 900
Powershell 3.0 is flexible in working with data. After importing in candidate spreadsheet data, we want to ask some questions:
- How many donations did each candidate receive greater than or equal to $800?
- How many donations did each candidate receive less than or equal to $100?
# import data and create variable list
sl C:\users\rferrisx\Downloads
$date="10.22.2012"
if ($out) {rv -ea 0 out}
$NM= import-csv .\NM.$date.csv
$MDK= import-csv .\MDK.$date.csv
$JEO= import-csv .\JEO.$date.csv
$VKB= import-csv .\VKB.$date.csv
$varlist=foreach ($i in @("NM","MDK","JEO","VKB")) {ls variable:/$i}
rv -ea 0 a
# create one database stored as a variable (e.g. '$out') by merging all candidate donations.
# Add a 'member' or field (e.g. Candidate Name) to each record
$out+=$Varlist.name | % {
$a=(ls variable:/$PSItem).value;
$Name=(ls variable:/$PSItem).name;
$a | add-member -force -passthru -NotePropertyName Candidate -NotePropertyValue $PSItem;
}
# An $out record now is
$out[0]
Contributor : ATU
Date : 06/22/12
Amount : 900
P/G : P
City : WASHINGTON
State : DC
Zip : 20016
Employer :
Occupation :
Candidate : NM
Now we want ordered hash tables to answer our questions:
foreach ($i in $varlist.Name) {$ge800+=[ordered]@{$i=$(($out | ? {$_.Amount -ge "800"} | ? {$_.Candidate -eq $i}).count)};}
$ge800
Name Value
NM 9
MDK 13
JEO 31
VKB 37
foreach ($i in $varlist.Name) {$le100+=[ordered]@{$i=$(($out | ? {$_.Amount -le "100"} | ? {$_.Candidate -eq $i}).count)};}
$le100
Name Value
NM 136
MDK 115
JEO 43
VKB 34
Using .NET 4.0 charting, our last ordered hashtable becomes a chart:
$hashdata=$le100
Chart-HashData RangeColumn 600 600 "Candidate Donations less than or equal to `$100" "Candidate" "Number of Small Donations"
A full script for filtering 'out of state' contributions into two graphs and charts is below. Here are some prerequisites for working with 'dirty data'.
The CSV from WA PDC data sometimes needs to be cleaned up before importing:
- top summaries removed
- special characters not imported
- leading spaces removed for all columns
- zipcodes standardized to five or nine characters
To remove leading spaces and header lines I use a regex query in Open Office (e.g. '^ ').
$JEO[0]
Contributor : EADES KASHA
Date : 05/08/12
Amount : 900
P/G : P
City : BELLINGHAM
State : WA
Zip : 98226
Employer :
Occupation : HOMEMAKER
In Powershell 3.0 this gives us such that (for sum of all amounts for any candidate):
($JEO.Amount | measure-object -sum).sum
62761.9
$Candidate_States give us data that excludes all contributions from WA. Like this:
$JEO_States=$JEO | Where State -ne WA |group -property State,Amount -noelement | sort -desc Count,Name
$JEO_States
Count Name
----- ----
2 VA, 900
2 VA, 800
2 TX, 250
1 TX, 900
1 TX, 800
1 TX, 500
1 TX, 400
1 TX, 350
1 PA, 400
1 OR, 900
1 OR, 500
1 NY, 900
1 NJ, 500
1 NJ, 300
1 NE, 250
1 MO, 900
1 MO, 800
1 FL, 500
1 DC, 400
1 CA, 800
1 CA, 300
1 CA, 250
1 CA, 100
1 AR, 800
# Produces charts and graphs for Out of State Donations for Select Candidates from PDC WA data.
# Revise path for your own folders
# Revise file names for your candidates
# Set date from CSV imports
# Uses Chart-HashData.ps1 (See horizontal-logic.blogspot.com code repository)
# Assume csv PDC data has been cleaned up, special characters not imported, leading spaces removed,zipcodes normalized to either 5 or 9 digits
sl C:\users\rferrisx\Downloads
$date="10.22.2012"
[array]$NM= import-csv .\NM.$date.csv
[array]$MDK= import-csv .\MDK.$date.csv
[array]$JEO= import-csv .\JEO.$date.csv
[array]$VKB= import-csv .\VKB.$date.csv
rv -ea 0 HashData
rv -ea 0 Total
$Global:NMStates=$NM | Where State -ne WA |group -property State,Amount -noelement | sort -desc Count,Name
$States=$NM | Where State -ne WA |group -property Amount -noelement | sort -desc Count
foreach ($i in $States) {$total+=$i.count * $i.name}
$HashData+=[ordered]@{(ls variable:/NM).Name=$Total;}
rv -ea 0 Total
$Global:MDKStates=$MDK | Where State -ne WA |group -property State,Amount -noelement | sort -desc Count,Name
$States=$MDK | Where State -ne WA |group -property Amount -noelement | sort -desc Count
foreach ($i in $States) {$total+=$i.count * $i.name}
$HashData+=[ordered]@{(ls variable:/MDK).Name=$Total;}
rv -ea 0 Total
$Global:JEOStates=$JEO | Where State -ne WA |group -property State,Amount -noelement | sort -desc Count,Name
$States=$JEO | Where State -ne WA |group -property Amount -noelement | sort -desc Count
foreach ($i in $States) {$total+=$i.count * $i.name}
$HashData+=[ordered]@{(ls variable:/JEO).Name=$Total;}
rv -ea 0 Total
$Global:VKBStates=$VKB | Where State -ne WA |group -property State,Amount -noelement | sort -desc Count,Name
$States=$VKB | Where State -ne WA |group -property Amount -noelement | sort -desc Count
foreach ($i in $States) {$total+=$i.count * $i.name}
$HashData+=[ordered]@{(ls variable:/VKB).Name=$Total;}
$HashData
Chart-HashData RangeColumn 500 500 "Total Amount Out of State Donations" "Candidate" "Amount" "SeeHowTheyFundRaise"
$PCT_NM=$HashData[0]/($NM.Amount | measure-object -sum).sum
$PCT_MDK=$HashData[1]/($MDK.Amount | measure-object -sum).sum
$PCT_JEO=$HashData[2]/($JEO.Amount | measure-object -sum).sum
$PCT_VKB=$HashData[3]/($VKB.Amount | measure-object -sum).sum
rv -ea 0 hashdata
$HashData+=[ordered]@{(ls variable:/NM).Name=[float]($PCT_NM * 100);}
$HashData+=[ordered]@{(ls variable:/MDK).Name=[float]($PCT_MDK * 100);}
$HashData+=[ordered]@{(ls variable:/JEO).Name=[float]($PCT_JEO * 100);}
$HashData+=[ordered]@{(ls variable:/VKB).Name=[float]($PCT_VKB * 100);}
$HashData
Chart-HashData RangeColumn 500 500 "Out of State Percentage" "Candidate" "Percent of All Donations from Out of State" "SeeHowTheyFundRaise"
Useing Chart-HashData.ps1 for the .NET 4.0,charts look like this:
No comments:
Post a Comment