Excel Convert Text To Columns Adress field with 370 total records
I have a small excel document where one column contains mailing addresses in column G (OTHER_POSSIBLE_ADDRESS) in the following format:
CRINITI JOSEPH 2509 S 20TH ST PHILADELPHIA PA 19145
And I would like to have this column broken into separate columns in the following format:
LAST NAME (CRINITI),
FIRST NAME (JOSEPH),
STREE ADDRESS (2509 S 20TH ST),
The challenge is that not all records have the exact same number of words representing street. Some are missing spaces. Some have middle names, some not.
There are 370 records total. This work can be done manually or via with an algorithm.
If you apply, please indicate whether you plan to work manually or build an algorithm and please see the instructions below corresponding to ALGORITHM or MANUAL.
There are potentially many more records to be parsed in this fashion. An algorithm that can automatically do this sorting would be of tremendous value, but if the task is too onerous for 370 records, please sort this set manually.
I can give you a set of rules to make it easier. For instance, taking row 15 as an example,
C/O M FERRARA 815 FEDERAL ST PHILADELPHIA PA 19147-5006
Rule 1: the string should be split before the last set of characters "19147-5006" which is ZIP(). Then shortened to 5 characters "19147"
Rule 2: The string should be split before the two characters that represent US State "PA" in this case, but please look at the list of all US State abbreviations: http://en.wikipedia.org/wiki/List_
Rule 3: Split the string before city "PHILADELPHIA" in this case, but it can be a different city: http://en.wikipedia.org/wiki/List_
I would make sure that the algorithm checks city but also checks state to make sure the string isn't spit by a street name that happens to match a city name.
Rule 4: This is trickier, I would split before the numbers AND after the abbreviation for "Street" "ST" -- There can be any number of abbreviations to signal the end of this phrase (see the following list)
Ct. = Court
Ln. = Lane
St. = Street
Rd. = Road
Terr. = Terrace
Cir. = Circle
Hwy. = Highway
Pkwy. = Parkway
Ave. = Avenue
Way = Way (no abbreviation)
Rdg. = Ridge
Blvd. = Boulevard
Dr. = Drive
Rule 5: In this example "C/O M FERRARA" the "C/O" should be ignored because it stands for "Care Of." "M" is first initial of FIRST_NAME and "FERRARA" is LAST_NAME.
Rule 5 is going to be the most challenging to accomplish because FIRST_NAME is sometimes before and sometimes after LAST_NAME. Sometimes there's an initial for FIRST_NAME, sometimes the full FIRST_NAME, sometimes an initial for MIDDLE_NAME, sometimes nothing at all for MIDDLE_NAME.
FOR MANUAL WORKERS
Please tell me how long you believe it will take to perform the entire job?
Please break down the following 5 records manually to demonstrate your understanding of the job:
KAUFMAN NECHAMA P O BOX 51215 PHILADELPHIA PA 19115-0215
MARTIN M TRENDLER401 S WATTS ST Philadelphia PA 19147-1119
POYSER FELIX ETAL TRS OF 1219-21 FITZWATER ST PHILADELPHIA PA 19147-2501
C/O WELKER ADP REALTY 401 S 2ND ST PHILADELPHIA PA 19147-1612
LOURDES FERNANDEZ S/W 532 GREENWICH ST Philadelphia PA 191476525