Theres a python in my catalog!

Here is another program installment.
This one solves a need I had at work, taking a pipe delimited file from a library program and extracting the publication year, call number and title field to output as one pipe delimited line each. So it could be opened in excel .

I had to step outside the current commands talked about but I tried to stay close. Hope I added enough comments to make it clear what I am up to.

The file is 8500 line long and my first try only found 150. 🙁
I had tried to use with and had something wrong.
I finally settled on the “for Line in record:” to loop and that worked.

The file is on github.
There is also a file catalog2.txt.
that has two of the records for testing.
you will have to change the file locations to match where you have put the file.

Below is the program.

# Created on a win system with IDLE
# Practice python script project 2014-003.py

# A script to pull in an output file with pipe delimited values, with a varied number
# of items per Title record.
# list of unused title to create a smaller list of title with the format
# Publication Year | Call# | Title

# Open file for read only access, change the filename to your needs
record = open(“c:\data\catalog.txt”, “r”)

# Open a file for output, write access, change the filename to your needs
output = open(“c:\data\export.txt”,”w”)

# Setup the empty lists

Title = [] # The Title of the item(s)

T = int(0) # A counter for the end

Item = [] # The item ID of the item(s)

Callnumber = [] # The Call Number of the items(s)

C = int(0) # A counter for the Call numbers

Pubyear = [] # The Publication Year of the Title

Finished = [] # Used to put together the data for export

Working = [] # The entire input from one file read, split by “|”

# Setup loop to read through entire file for processing
# Loads a line of text from the object record into lineuntil the file end

for Line in record:

# Split the record into a list called working by the pipe symbol
Working = Line.split(“|”)

# Find the location of ITEM tags, not used at this time
for x in range(0,len(Working)):
if Working[x] == “ITEM>”:
Item.append(x)

# Find the location of the Call number tags
for x in range(0,len(Working)):
if Working[x] == “CALLNUM>”:
Callnumber.append(Working[x+1])# Load the Call number
C=C+1 # Totaling the call numbers

# Find the location of the Title tag
for x in range(0,len(Working)):
if Working[x] == “Title”:
Title.append(Working[x+1])# Load the Title
Pubyear.append(Working[x-2])# load the Publication year
T= T+1 # Totaling the titles

# Pack the finished list for each Callnumber in the record, may be one or more
for x in range(0,len(Callnumber)):
Finished.append(Pubyear)
Finished.append(Callnumber[x])
Finished.append(Title)

# Prepare the final output string for wrting
export = str(“{}|{}|{}|\n”).format(Finished[0],Finished[1],Finished[2])
output.write(export)# Write the one item to the files
Finished = []# Clear the list for the next item
Title = []# Clear the Title list
Pubyear = []# Clear the Publication year list
Callnumber = []# Clear the Call Number list
print “Processed #”,T # To indicate work is being done.
# Indicate count of processed records for quality control

print T,” Titles processed”
print C,” Call Numbers processed”

# Close the files and end

output.close()
record.close()

raw_input() # To keep the window open on gui systems

About jcoffey

https://profiles.google.com/jerry.coffey/posts
This entry was posted in Python and tagged . Bookmark the permalink.