This week's blog post was brought to you by Mail...kimp?

Posted by fakebenjay on June 4, 2017

This week’s blog post is the first of hopefully many that will chart the progress of my campaign finance webapp, tentatively called “Follow The Money.” You can see the (very early and nowhere near complete) source code here.

Right now, the plan is to use data from OpenSecrets.org, the Center for Responsive Politics’ (CRP) campaign finance portal, which classifies and organizes its data in a way that the raw FEC data from ProPublica’s API doesn’t. The finished product will feature a slew of visualizations, built with React and D3 on the front end, that hopefully contextualize numerous facets of the thorny world of campaign finance. On the back end, I’ve already built out a series of models in Rails for candidates, states, election cycles, donors, industries, and parties (plus a whopping 11 join tables to establish all the relationships between them). But how will the front end access the Rails models? React will make API calls to Rails and get back JSON it can build into D3 visualizations.

But where will the JSON come from? The answer to that…is serializers (congratulate yourself with a bucket of discount shrimp at the Crab Crib if you’ve made it this far past the title).

Serialization is the process of converting a data structure into a format that can be easily stored or transmitted. In our case, we want to take scraped candidate data and convert it into JSON so React can easily parse it. To do that, we need to generate a serializer. With the active_model_serializers gem installed, doing that for our “Candidate” model is as easy as typing…

rails g serializer candidate

…which gives us a serializer file that looks something like this…

class CandidateSerializer < ActiveModel::Serializer
  attributes :id, :name, :state_id, :district, :office, :party_id 
end

The attributes line defines which fields we’ll later find in our JSON. Anything on the model that isn’t listed here gets left out.
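To make the idea concrete without any Rails magic, here is a minimal plain-Ruby sketch of what a serializer does conceptually: keep a whitelist of attributes and emit only those as JSON. PlainCandidate, PlainCandidateSerializer, and the extra created_at field are made-up names for illustration, and this is not how ActiveModel::Serializer is implemented internally.

```ruby
require 'json'

# Hypothetical stand-in for the Candidate model, so the sketch runs without Rails.
# created_at is an assumed extra field that we do NOT want in the JSON.
PlainCandidate = Struct.new(:id, :name, :state_id, :district, :office,
                            :party_id, :created_at, keyword_init: true)

# Hypothetical hand-rolled serializer: only whitelisted attributes are emitted.
class PlainCandidateSerializer
  ATTRIBUTES = %i[id name state_id district office party_id].freeze

  def initialize(candidate)
    @candidate = candidate
  end

  # Build a hash of the whitelisted attributes, nested under a root key.
  def as_json
    pairs = ATTRIBUTES.map { |attr| [attr, @candidate.public_send(attr)] }
    { candidate: pairs.to_h }
  end

  def to_json(*args)
    as_json.to_json(*args)
  end
end

candidate = PlainCandidate.new(
  id: 1, name: 'Sarah Koenig', state_id: 13, district: '7',
  office: 'Representative', party_id: 1, created_at: '2017-06-04'
)

puts PlainCandidateSerializer.new(candidate).to_json
```

The point of the whitelist is that the model can grow new columns without the API response changing underneath the front end.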

Now if we want to set those attributes, we have to go to our controller, which looks like this… (Obviously, this is a very early version. The final product will allow you to find data for more than one member of Congress.)

require 'open-uri'

class CandidatesController < ApplicationController
  def new
    @candidate = Candidate.new
    url = 'https://www.opensecrets.org/politicians/summary.php?cid=N00000001'
    # URI.open comes from open-uri; a bare `open` no longer accepts URLs in Ruby 3+
    html = URI.open(url)
    raw_candidate = Nokogiri::HTML(html)

    # Split the title text once instead of re-querying the page for each attribute
    title = raw_candidate.css('#headInfo #title').text.split(' ')

    @candidate.name = raw_candidate.css('#headInfo h1').text
    @candidate.state_id = State.find_by(abbreviation: title[3][0..1]).id
    @candidate.district = '7' # dummy value for now
    @candidate.office = title[0]
    @candidate.party_id = Party.find_by(abbreviation: title[1][1]).id

    if @candidate.save
      render json: @candidate, serializer: CandidateSerializer
    else
      # 422 fits a record that failed to save; 401 would mean "unauthorized"
      render json: { error: 'There was an issue saving this candidate' }, status: :unprocessable_entity
    end
  end
end

Our controller fetches a candidate’s individual page, scrapes the necessary information from it using Nokogiri, and uses that information to set our Candidate object’s attributes. (Note: OpenSecrets doesn’t list districts for individual House members, and I still haven’t figured out how/where to scrape that data. “7” is a dummy value here, corresponding to IL-07 in Chicago, where WBEZ, home of NPR’s hit shows Serial and This American Life, is based.) Once our Candidate is saved, we simply render its attributes as JSON.
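Pulling fields out of the title by array index breaks loudly the moment the page layout shifts, so here is a hedged sketch of a sturdier alternative. It assumes the title text looks like "Representative (D - IL)", which is only my inference from the split indices in the controller, not something confirmed against the live page. TITLE_PATTERN and parse_title are hypothetical names.

```ruby
# Assumed title shape: "<Office> (<Party letter> - <State abbreviation>)".
# This shape is inferred from the controller's split indices, not verified.
TITLE_PATTERN = /\A(?<office>\S+)\s+\((?<party>[A-Z])\s*-\s*(?<state>[A-Z]{2})\)/

# Hypothetical helper: returns a hash of the parsed pieces,
# or nil when the text does not match the assumed shape.
def parse_title(title)
  match = TITLE_PATTERN.match(title.to_s.strip)
  return nil unless match

  { office: match[:office], party: match[:party], state: match[:state] }
end

p parse_title('Representative (D - IL)')
p parse_title('some unexpected text')
```

Returning nil instead of raising lets the controller decide what to do with a page it can't read, rather than crashing on a nil somewhere in the middle of a chain of splits.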

render json: @candidate, serializer: CandidateSerializer

Our Candidate object is stored in the instance variable @candidate, and is converted to JSON based on the attributes in our CandidateSerializer! When we make an API call, our final candidate should look something like this…

{
  "candidate": {
    "id": 1,
    "name": "Sarah Koenig",
    "state_id": 13,
    "district": "7",
    "office": "Representative",
    "party_id": 1
  }
}
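If you want to sanity-check that shape before any React exists, a few lines of plain Ruby will do. This is only a sketch: the body string below is the sample response from this post pasted in by hand, not output fetched from a running server.

```ruby
require 'json'

# Sample response body copied from the post (not fetched from a live API).
body = <<~PAYLOAD
  {
    "candidate": {
      "id": 1,
      "name": "Sarah Koenig",
      "state_id": 13,
      "district": "7",
      "office": "Representative",
      "party_id": 1
    }
  }
PAYLOAD

# fetch raises KeyError if the root key is missing, which makes shape changes obvious.
candidate = JSON.parse(body).fetch('candidate')

puts "#{candidate['name']} (#{candidate['office']}, district #{candidate['district']})"
```

Note that state_id and party_id are still just foreign keys at this point, so the front end would need another lookup to turn them into names.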