Database queries

30 min | Last modified: December 10, 2020

http://www.nltk.org/book/

Text Analytics with Python

Consider the following table in a database:

City        Country  Population
-------------------------------
athens      greece         1368
bangkok     thailand       1178
barcelona   spain          1280
berlin      germany        3481
birmingham  united_kingdom 1112

How can the following question be expressed in SQL?

Which country is athens in?

Answer:

SELECT Country FROM city_table WHERE City = 'athens'
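
The query can be tried out with Python's built-in `sqlite3` module. The sketch below is only an illustration: it rebuilds the five sample rows in an in-memory database (the column types are an assumption, since the original schema is not shown) and runs the same `SELECT`.

```python
import sqlite3

# Sample rows based on the table above; the column types are assumed.
rows = [
    ("athens", "greece", 1368),
    ("bangkok", "thailand", 1178),
    ("barcelona", "spain", 1280),
    ("berlin", "germany", 3481),
    ("birmingham", "united_kingdom", 1112),
]

# Build a throwaway in-memory database holding city_table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_table (City TEXT, Country TEXT, Population INTEGER)"
)
conn.executemany("INSERT INTO city_table VALUES (?, ?, ?)", rows)

# Run the query that answers "Which country is athens in?"
result = conn.execute(
    "SELECT Country FROM city_table WHERE City = 'athens'"
).fetchall()
print(result)  # [('greece',)]
conn.close()
```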

Explain the following grammar:

[1]:

%%writefile sql0.fcfg
% start S

S[SEM=(?np + WHERE + ?vp)] -> NP[SEM=?np] VP[SEM=?vp]

VP[SEM=(?v + ?pp)] -> IV[SEM=?v] PP[SEM=?pp]

VP[SEM=(?v + ?ap)] -> IV[SEM=?v] AP[SEM=?ap]

NP[SEM=(?det + ?n)] -> Det[SEM=?det] N[SEM=?n]

PP[SEM=(?p + ?np)] -> P[SEM=?p] NP[SEM=?np]

AP[SEM=?pp] -> A[SEM=?a] PP[SEM=?pp]

NP[SEM='Country="greece"'] -> 'Greece'

NP[SEM='Country="china"'] -> 'China'

Det[SEM='SELECT'] -> 'Which' | 'What'

N[SEM='City FROM city_table'] -> 'cities'

IV[SEM=''] -> 'are'

A[SEM=''] -> 'located'

P[SEM=''] -> 'in'

Overwriting sql0.fcfg

[2]:

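from nltk import load_parser

# Load the feature grammar written to disk by the previous cell.
cp = load_parser('sql0.fcfg')
query = 'What cities are located in China'
# Parse the tokenized question and keep every parse tree found.
trees = list(cp.parse(query.split()))
# The SEM feature of the root node holds the SQL fragments.
answer = trees[0].label()['SEM']
# Drop the empty strings contributed by semantically empty words.
answer = [s for s in answer if s]
q = ' '.join(answer)
print(q)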

SELECT City FROM city_table WHERE Country="china"
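
To see where this string comes from, the sketch below mimics by hand, in plain Python and without NLTK, how the `SEM` values are concatenated up the parse tree of 'What cities are located in China'. Plain tuples stand in for NLTK's feature values; that is an assumption made for illustration, not NLTK's internal representation.

```python
# Lexical entries: the SEM value attached to each word in sql0.fcfg.
det = ("SELECT",)                  # Det -> 'What'
n = ("City FROM city_table",)      # N   -> 'cities'
iv = ("",)                         # IV  -> 'are'
a = ("",)                          # A   -> 'located' (never used below)
p = ("",)                          # P   -> 'in'
np_china = ('Country="china"',)    # NP  -> 'China'

# Phrase rules: each SEM is the concatenation written in the grammar.
np_subject = det + n               # NP[SEM=(?det + ?n)]
pp = p + np_china                  # PP[SEM=(?p + ?np)]
ap = pp                            # AP[SEM=?pp]: the SEM of A is discarded
vp = iv + ap                       # VP[SEM=(?v + ?ap)]
s = np_subject + ("WHERE",) + vp   # S[SEM=(?np + WHERE + ?vp)]

# Same post-processing as the parser cell: drop empty pieces, then join.
q = " ".join(piece for piece in s if piece)
print(q)  # SELECT City FROM city_table WHERE Country="china"
```

The two empty strings left in `s` come from 'are' and 'in', which is why the parser cell filters the tuple before joining it.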

Exercise: Extend the grammar so that the system can interpret the following question:

What cities are in China and have populations above 1,000,000?

and generate the following SQL clause as its answer:

SELECT City FROM city_table WHERE Country = 'china' AND Population > 1000
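
One detail of the target clause is that one million in the question becomes `1000`, which suggests that the `Population` column is stored in thousands (an assumption, though it is consistent with the sample table, where athens has 1368). The hypothetical helper `to_thousands` below sketches that conversion for a numeric token taken from the question; it is not part of NLTK or of the grammar.

```python
def to_thousands(token: str) -> int:
    """Convert a token such as '1,000,000' or '1.000.000' to thousands.

    Assumes commas and periods only group digits (no decimal part).
    """
    digits = token.replace(",", "").replace(".", "")
    return int(digits) // 1000


# Build the numeric condition of the WHERE clause from the question token.
threshold = to_thousands("1,000,000")
clause = f"Population > {threshold}"
print(clause)  # Population > 1000
```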

[ ]:
