Python | [Solution] Musical Track Database (Using Databases with Python)

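Rant: I can't remember why I never wrote this one up. I probably took a shortcut and didn't bother with a write-up. Today I saw someone asking about it, so here is my approach. I got full marks back then with a shortcut; today I worked through it again and also wrote the proper solution. If you have a better one, feel free to leave a comment.
Problem: Musical Track Database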
This application will read an iTunes export file in XML and produce a properly normalized database with this structure:

CREATE TABLE Artist (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
);
CREATE TABLE Genre (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
);
CREATE TABLE Album (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    artist_id INTEGER,
    title TEXT UNIQUE
);
CREATE TABLE Track (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    title TEXT UNIQUE,
    album_id INTEGER,
    genre_id INTEGER,
    len INTEGER, rating INTEGER, count INTEGER
);

If you run the program multiple times in testing or with different files, make sure to empty out the data before each run.
You can use this code as a starting point for your application: http://www.py4e.com/code3/tracks.zip. The ZIP file contains the Library.xml file to be used for this assignment. You can export your own tracks from iTunes and create a database, but for the database that you turn in for this assignment, only use the Library.xml data that is provided.
To grade this assignment, the program will run a query like this on your uploaded database and look for the data it expects to see:
SELECT Track.title, Artist.name, Album.title, Genre.name FROM Track JOIN Genre JOIN Album JOIN Artist ON Track.genre_id = Genre.ID and Track.album_id = Album.id AND Album.artist_id = Artist.id ORDER BY Artist.name LIMIT 3

The expected result of the modified query on your database is: (shown here as a simple HTML table with titles)
Track | Artist | Album | Genre
Chase the Ace | AC/DC | Who Made Who | Rock
D.T. | AC/DC | Who Made Who | Rock
For Those About To Rock (We Salute You) | AC/DC | Who Made Who | Rock
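To check a database locally before uploading it, the grading query can be run from Python. The following is a minimal sketch, assuming the tables follow the schema above. The helper name `first_tracks` and the hand-made demo row are illustrative only; for the real check, pass a connection opened with `sqlite3.connect('trackdb.sqlite')` instead of the in-memory one.

```python
import sqlite3

# The query quoted in the assignment, laid out over several lines.
GRADING_QUERY = '''
SELECT Track.title, Artist.name, Album.title, Genre.name
FROM Track JOIN Genre JOIN Album JOIN Artist
    ON Track.genre_id = Genre.id
    AND Track.album_id = Album.id
    AND Album.artist_id = Artist.id
ORDER BY Artist.name LIMIT 3
'''

def first_tracks(conn):
    """Run the grading query and return the rows as a list of tuples."""
    return conn.execute(GRADING_QUERY).fetchall()

# Demo on a throwaway in-memory database holding one hand-made track.
demo = sqlite3.connect(':memory:')
demo.executescript('''
CREATE TABLE Artist (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE Genre (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE Album (id INTEGER PRIMARY KEY, artist_id INTEGER, title TEXT UNIQUE);
CREATE TABLE Track (id INTEGER PRIMARY KEY, title TEXT UNIQUE,
                    album_id INTEGER, genre_id INTEGER,
                    len INTEGER, rating INTEGER, count INTEGER);
INSERT INTO Artist (name) VALUES ('AC/DC');
INSERT INTO Genre (name) VALUES ('Rock');
INSERT INTO Album (artist_id, title) VALUES (1, 'Who Made Who');
INSERT INTO Track (title, album_id, genre_id) VALUES ('D.T.', 1, 1);
''')
print(first_tracks(demo))
```

Because the query uses inner joins, a track whose genre_id is NULL never appears in the result, so an empty Genre table makes the query return nothing at all.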
My solution 1 (the lazy way):
import xml.etree.ElementTree as ET
import sqlite3
conn = sqlite3.connect('trackdb.sqlite')
cur = conn.cursor()
# Make some fresh tables using executescript()
cur.executescript('''
DROP TABLE IF EXISTS Artist;
DROP TABLE IF EXISTS Album;
DROP TABLE IF EXISTS Track;
DROP TABLE IF EXISTS Genre;
CREATE TABLE Artist (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
);
CREATE TABLE Genre (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
);
CREATE TABLE Album (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    artist_id INTEGER,
    title TEXT UNIQUE
);
CREATE TABLE Track (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    title TEXT UNIQUE,
    album_id INTEGER,
    genre_id INTEGER,
    len INTEGER, rating INTEGER, count INTEGER
);
''')

fname = input('Enter file name: ')
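if len(fname) < 1:
    fname = 'Library.xml'

# <key>Track ID</key><integer>369</integer>
# <key>Name</key><string>Another One Bites The Dust</string>
# <key>Artist</key><string>Queen</string>
def lookup(d, key):
    # Return the text of the element that follows <key>key</key> inside d
    found = False
    for child in d:
        if found:
            return child.text
        if child.tag == 'key' and child.text == key:
            found = True
    return None
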
stuff = ET.parse(fname)
all = stuff.findall('dict/dict/dict')
print('Dict count:', len(all))
for entry in all:
    if lookup(entry, 'Track ID') is None:
        continue

    name = lookup(entry, 'Name')
    artist = lookup(entry, 'Artist')
    album = lookup(entry, 'Album')
    count = lookup(entry, 'Play Count')
    rating = lookup(entry, 'Rating')
    length = lookup(entry, 'Total Time')

    if name is None or artist is None or album is None:
        continue

    print(name, artist, album, count, rating, length)

    cur.execute('''INSERT OR IGNORE INTO Artist (name)
        VALUES ( ? )''', (artist,))
    cur.execute('SELECT id FROM Artist WHERE name = ? ', (artist,))
    artist_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Album (title, artist_id)
        VALUES ( ?, ? )''', (album, artist_id))
    cur.execute('SELECT id FROM Album WHERE title = ? ', (album,))
    album_id = cur.fetchone()[0]

    # Note: genre_id is never set here, so the Genre table stays empty
    cur.execute('''INSERT OR REPLACE INTO Track
        (title, album_id, len, rating, count)
        VALUES ( ?, ?, ?, ?, ? )''',
        (name, album_id, length, rating, count))

# Commit once after all rows have been inserted
conn.commit()
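Running the code above produces trackdb.sqlite. Opening it in DB Browser shows that the data is mostly correct, except for the Genre table, which is empty: I only changed the CREATE TABLE part and left the rest of the starter code untouched, so nothing is ever written to Genre. The lazy fix is to give that table a single row (1, Rock) and then set genre_id to 1 for every row in Track with one SQL statement. Once that is done, the grading query returns the expected rows. It is not the proper solution, but it earns full marks.
My solution 2 (the proper way):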
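The lazy patch can also be applied from Python instead of DB Browser. This is only a sketch of the two statements described above; the function name `apply_lazy_genre_fix` and the two demo tracks are made up for illustration, and the demo runs on an in-memory database rather than the real trackdb.sqlite.

```python
import sqlite3

def apply_lazy_genre_fix(conn, genre_name='Rock'):
    """Create a single Genre row with id 1 and point every Track at it."""
    cur = conn.cursor()
    cur.execute('INSERT OR IGNORE INTO Genre (id, name) VALUES (1, ?)',
                (genre_name,))
    cur.execute('UPDATE Track SET genre_id = 1')
    updated = cur.rowcount  # number of Track rows touched by the UPDATE
    conn.commit()
    return updated

# Demo: two tracks with no genre, as solution 1 leaves them.
demo = sqlite3.connect(':memory:')
demo.executescript('''
CREATE TABLE Genre (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
);
CREATE TABLE Track (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    title TEXT UNIQUE,
    genre_id INTEGER
);
INSERT INTO Track (title) VALUES ('Chase the Ace');
INSERT INTO Track (title) VALUES ('D.T.');
''')
print(apply_lazy_genre_fix(demo))
```

This labels every track as Rock regardless of its real genre, which is why it only satisfies the three rows the grader looks at.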
import xml.etree.ElementTree as ET
import sqlite3
conn = sqlite3.connect('trackdb.sqlite')
cur = conn.cursor()
# Make some fresh tables using executescript()
cur.executescript('''
DROP TABLE IF EXISTS Artist;
DROP TABLE IF EXISTS Album;
DROP TABLE IF EXISTS Track;
DROP TABLE IF EXISTS Genre;
CREATE TABLE Artist (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
);
CREATE TABLE Genre (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
);
CREATE TABLE Album (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    artist_id INTEGER,
    title TEXT UNIQUE
);
CREATE TABLE Track (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    title TEXT UNIQUE,
    album_id INTEGER,
    genre_id INTEGER,
    len INTEGER, rating INTEGER, count INTEGER
);
''')

fname = input('Enter file name: ')
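if len(fname) < 1:
    fname = 'Library.xml'

# <key>Track ID</key><integer>369</integer>
# <key>Name</key><string>Another One Bites The Dust</string>
# <key>Artist</key><string>Queen</string>
def lookup(d, key):
    # Return the text of the element that follows <key>key</key> inside d
    found = False
    for child in d:
        if found:
            return child.text
        if child.tag == 'key' and child.text == key:
            found = True
    return None
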
stuff = ET.parse(fname)
all = stuff.findall('dict/dict/dict')
print('Dict count:', len(all))
for entry in all:
    if lookup(entry, 'Track ID') is None:
        continue

    name = lookup(entry, 'Name')
    artist = lookup(entry, 'Artist')
    album = lookup(entry, 'Album')
    genre = lookup(entry, 'Genre')
    count = lookup(entry, 'Play Count')
    rating = lookup(entry, 'Rating')
    length = lookup(entry, 'Total Time')

    # Skip tracks that are missing any of the fields the joins depend on
    if name is None or artist is None or album is None or genre is None:
        continue

    print(name, artist, album, genre, count, rating, length)

    cur.execute('''INSERT OR IGNORE INTO Artist (name)
        VALUES ( ? )''', (artist,))
    cur.execute('SELECT id FROM Artist WHERE name = ? ', (artist,))
    artist_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Genre (name)
        VALUES ( ? )''', (genre,))
    cur.execute('SELECT id FROM Genre WHERE name = ? ', (genre,))
    genre_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Album (title, artist_id)
        VALUES ( ?, ? )''', (album, artist_id))
    cur.execute('SELECT id FROM Album WHERE title = ? ', (album,))
    album_id = cur.fetchone()[0]

    cur.execute('''INSERT OR REPLACE INTO Track
        (title, album_id, genre_id, len, rating, count)
        VALUES ( ?, ?, ?, ?, ?, ? )''',
        (name, album_id, genre_id, length, rating, count))

# Commit once after all rows have been inserted
conn.commit()
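All it takes is adding genre and genre_id, following the existing pattern, in each place where they belong. Running this version gives the correct result.
Some thoughts:
1. The lazy solution is a fine way to scrape the points, but it is still worth taking the time to work out the proper one.
2. If it is unclear why the instructor's code is written this way, open the source file (the .xml) and look. all = stuff.findall('dict/dict/dict') simply walks down to the level where the individual tracks sit side by side; the rest is just pulling the fields out of each track.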
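The Artist and Genre steps both repeat the same "INSERT OR IGNORE, then SELECT id" pair, so that pattern could be pulled into a small helper. This is a possible refactoring, not part of the starter code; `get_or_create_id` is a hypothetical name, and it only covers single-column tables (Album also stores artist_id, so it would still need its own statements).

```python
import sqlite3

def get_or_create_id(cur, table, column, value):
    """Insert value into table.column if it is missing, then return its id."""
    # Table and column names cannot be bound as parameters, so they are
    # formatted into the SQL text; only pass trusted, hard-coded names here.
    cur.execute(f'INSERT OR IGNORE INTO {table} ({column}) VALUES (?)',
                (value,))
    cur.execute(f'SELECT id FROM {table} WHERE {column} = ?', (value,))
    return cur.fetchone()[0]

# Demo on an in-memory Genre table.
demo = sqlite3.connect(':memory:')
demo.execute('''CREATE TABLE Genre (
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
)''')
cur = demo.cursor()
rock_id = get_or_create_id(cur, 'Genre', 'name', 'Rock')
jazz_id = get_or_create_id(cur, 'Genre', 'name', 'Jazz')
rock_again = get_or_create_id(cur, 'Genre', 'name', 'Rock')
print(rock_id, jazz_id, rock_again)
```

The second lookup of Rock returns the same id as the first because the UNIQUE constraint on name makes the repeated insert a no-op.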
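To see what findall('dict/dict/dict') and lookup do without opening the full file, they can be tried on a tiny fragment. The XML below is a hand-written stand-in that mimics the nesting of an iTunes export (an outer dict, a Tracks dict inside it, and one dict per track); it is not copied from Library.xml.

```python
import xml.etree.ElementTree as ET

# Hand-written sample with the same nesting as an iTunes export file.
SAMPLE = '''<plist version="1.0">
<dict>
  <key>Tracks</key>
  <dict>
    <key>369</key>
    <dict>
      <key>Track ID</key><integer>369</integer>
      <key>Name</key><string>Another One Bites The Dust</string>
      <key>Artist</key><string>Queen</string>
    </dict>
  </dict>
</dict>
</plist>'''

def lookup(d, key):
    """Return the text of the element that follows <key>key</key> inside d."""
    found = False
    for child in d:
        if found:
            return child.text
        if child.tag == 'key' and child.text == key:
            found = True
    return None

root = ET.fromstring(SAMPLE)
# Three levels of <dict> below the root lead to the per-track dictionaries.
entries = root.findall('dict/dict/dict')
print(len(entries))
print(lookup(entries[0], 'Name'), lookup(entries[0], 'Artist'))
print(lookup(entries[0], 'Genre'))  # key absent in this sample, so None
```

Each track dict is a flat sequence of key elements followed by their values, which is why lookup returns the sibling that comes right after the matching key.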
