Home > BI, Code, Misc, g33k > Google Geocoding in SSIS

Google Geocoding in SSIS

I have an old YouTube tutorial on how to use Google Geocoding within SSIS, but I lost all the downloadable scripts.  I recently had a few people asking me if I could send them the scripts, so I deiceded to rewrite them.

Again, you can check out the YouTub video (it’s rough) for more details, just go to http://www.youtube.com/watch?v=HTSHzR-wSgc with this DISCLAIMER:  This code was just thrown together and is quite ugly and my microphone was terrible, I usually sound much better than that :-)  :-)

In a nutshell, I sniped about 18,000 golfcourses off the web with a PHP script and got basic course information for all of them, including the address information.  I wanted to use Google Maps to display these courses, so I thought I’d use Google Geocoding.  Caution, they don’t like it when you hit them 1000 times per second, so I would recommend using a Sleep command in your code like I did so you can throttle it without getting kicked off. 

So, I have an OLE Source task that reads from my database.  I created a Derived Column task to create placeholders for latitude, longitude, and level of accuracy.  I created a Script Component and selected street, city, state, and zip as input columns and added lat, long, and accuracy as read/write columns. 

Open the script editor, add a reference to System.Web, then create a new class called Google GeoCode, or whatever the heck you want to call it and paste in this code:

Option Strict Off

''' Might need this, might not

Imports System

Imports System.Data

Imports System.Math

Imports System.Net

Imports System.Web

Imports System.IO

Namespace GoogleGeoCoder

    Public Interface ISpatialCoordinate

        Property latitude() As Double

        Property longitude() As Double

    End Interface

    ''' <summary>   

    ''' Coordiate structure. Holds Latitude and Longitude.   

    ''' </summary>   

    Public Structure Coordinate

        Implements ISpatialCoordinate

 

        Private _latitude As Double

        Private _longitude As Double

 

        Public Sub New(ByVal lattitude As Double, ByVal longitude As Double)

            _latitude = lattitude

            _longitude = longitude

        End Sub

 

        Public Property latitude() As Double Implements ISpatialCoordinate.latitude

            Get

                Return _latitude

            End Get

            Set(ByVal value As Double)

                _latitude = value

            End Set

        End Property

 

        Public Property longitude() As Double Implements ISpatialCoordinate.longitude

            Get

                Return _longitude

            End Get

            Set(ByVal value As Double)

                _longitude = value

            End Set

        End Property

 

    End Structure

    Public Class GeoCode

        Const _googleUri As String = "http://maps.google.com/maps/geo?q="

        Const _googleKey As String = "yourgooglekeyhere"  ' get your own       

        Const _outputType As String = "csv"

        Private Shared Function GetGeoCodeUri(ByVal address As String) As Uri

            address = HttpUtility.UrlEncode(address)

            Return New Uri(String.Format("{0}{1}&output={2}&key={3}", _googleUri, address, _outputType, _googleKey))

        End Function

        ''' <summary>       

        ''' Gets a Coordinate from a address.       

        ''' </summary>       

        ''' <param name="address">An address.       

        ''' <remarks>       

        ''' <example>1600 Amphitheatre Parkway Mountain View, CA 94043</example>       

        ''' </remarks>       

        ''' </param>       

        ''' <returns>A spatial coordinate that contains the latitude and longitude of the address.</returns>       

        Public Shared Function GetCoordinates(ByVal address As String) As Coordinate

            Dim client As WebClient = New WebClient()

            Dim uri As Uri = GetGeoCodeUri(address)

            Dim geoCodeInfo As String()

            'The first number is the status code,            

            'the second is the accuracy,            

            'the third is the latitude,            

            'the fourth one is the longitude.           

            Try

                geoCodeInfo = client.DownloadString(uri).Split(",")

                Return New Coordinate(Convert.ToDouble(geoCodeInfo(2)), Convert.ToDouble(geoCodeInfo(3)))

            Catch ex As Exception

                Return New Coordinate(0.0, 0.0)

            End Try

        End Function

    End Class

End Namespace

 

Save it and then go back to the main script and add Import that class (see code).  Notice, the way I’m getting the level of accuracy with my ugly nested IF statements isn’t the prettiest way to do this, use your imagination.  Copy in this code into the Main Script:

 

' Microsoft SQL Server Integration Services user script component

' This is your new script component in Microsoft Visual Basic .NET

' ScriptMain is the entrypoint class for script components

 

Imports System

Imports System.Data

Imports System.Math

Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper

Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

Imports ScriptComponent_4a8732cababc49a999159a91c737082a.GoogleGeoCoder

 

Public Class ScriptMain

    Inherits UserComponent

 

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

 

        Dim coordinate As Coordinate = GeoCode.GetCoordinates(Row.street & ", " & Row.city & ", " & Row.state & ", " & Row.zip)

 

        If ((coordinate.latitude <> 0) And (coordinate.longitude <> 0)) Then

            Row.lat = Convert.ToDecimal(coordinate.latitude)

            Row.long = Convert.ToDecimal(coordinate.longitude)

            Row.acc = 1

        Else

            coordinate = GeoCode.GetCoordinates(Row.city & ", " & Row.state & ", " & Row.zip)

            If ((coordinate.latitude <> 0) And (coordinate.longitude <> 0)) Then

                Row.lat = Convert.ToDecimal(coordinate.latitude)

                Row.long = Convert.ToDecimal(coordinate.longitude)

                Row.acc = 2

            Else

                coordinate = GeoCode.GetCoordinates(Row.zip)

                If ((coordinate.latitude <> 0) And (coordinate.longitude <> 0)) Then

                    Row.lat = Convert.ToDecimal(coordinate.latitude)

                    Row.long = Convert.ToDecimal(coordinate.longitude)

                    Row.acc = 3

                Else

                    coordinate = GeoCode.GetCoordinates(Row.city & ", " & Row.state)

                    If ((coordinate.latitude <> 0) And (coordinate.longitude <> 0)) Then

                        Row.lat = Convert.ToDecimal(coordinate.latitude)

                        Row.long = Convert.ToDecimal(coordinate.longitude)

                        Row.acc = 4

                    Else

                        coordinate = GeoCode.GetCoordinates(Row.state)

                        If ((coordinate.latitude <> 0) And (coordinate.longitude <> 0)) Then

                            Row.lat = Convert.ToDecimal(coordinate.latitude)

                            Row.long = Convert.ToDecimal(coordinate.longitude)

                            Row.acc = 5

                        End If

                    End If

                End If

            End If

        End If

    End Sub

 

End Class

 

VN:F [1.9.1_1087]

Rating: 5.0/5 (2 votes cast)
VN:F [1.9.1_1087]
Rating: 0 (from 0 votes)
Google Geocoding in SSIS, 5.0 out of 5 based on 2 ratings
Categories: BI, Code, Misc, g33k Tags: , ,
  1. CJ
    November 22nd, 2009 at 20:38 | #1

    What did you have to do to ensure the coordinates do not lose precision in the database? When I run this, it works fine with the exception of truncating anything past the decimal.

    i.e., 39.0000000 instead of 39.123456, etc…

    Thanks!

    VA:F [1.9.1_1087]
    Rating: 0.0/5 (0 votes cast)
    VA:F [1.9.1_1087]
    Rating: 0 (from 0 votes)
  2. January 29th, 2010 at 17:10 | #2

    Hmmm, that didn’t happen for me. Something must be converting it to an integer either in the code or in the database. It’s almost impossible to debug script code in SSIS, but you can pull that code out and put it in a standalone .Net console app and see if it’s being rounded in the script code, but you should start by checking the datatypes in the table.

    VN:F [1.9.1_1087]
    Rating: 0.0/5 (0 votes cast)
    VN:F [1.9.1_1087]
    Rating: 0 (from 0 votes)
  1. No trackbacks yet.