Freebase - Semantic Web

Page history last edited by Boone 14 years, 10 months ago

Freebase http://www.freebase.com/

 

Raymond Yee: http://thatcamp.org/2009/06/how-to-make-freebase-useful-in-the-digital-humanities/

 

Freebase is like a Wikipedia for discrete facts.

 

"The hard work of reconciliation": tying together various facts in the Freebase database under a single name. E.g.: Arnold Schwarzenegger == The Governator == Ahnold

  • How do you automate the process? LCSH data, etc.
  • Maybe you have to have a real person do it: Amazon's Mechanical Turk
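A rough sketch of how the automate-or-ask-a-person question could play out, assuming a hand-built alias table. The identifiers (`person:...`), the `reconcile` function, and the 0.85 threshold are all hypothetical and not part of Freebase; the fuzzy matching uses Python's standard `difflib`, which is just one possible technique.

```python
from difflib import SequenceMatcher

# Hypothetical alias table: made-up canonical identifier -> known names.
ALIASES = {
    "person:arnold_schwarzenegger": [
        "Arnold Schwarzenegger", "The Governator", "Ahnold",
    ],
    "person:jesse_ventura": ["Jesse Ventura", "The Body"],
}


def normalize(name):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def reconcile(name, aliases=ALIASES, threshold=0.85):
    """Return (canonical_id, how) where how is 'exact' or 'fuzzy'.

    Names that match nothing well enough come back as (None, 'needs_review'),
    i.e. they would be queued for a real person to look at.
    """
    target = normalize(name)
    best_id, best_score = None, 0.0
    for canonical_id, names in aliases.items():
        for alias in names:
            candidate = normalize(alias)
            if candidate == target:
                return canonical_id, "exact"
            score = SequenceMatcher(None, target, candidate).ratio()
            if score > best_score:
                best_id, best_score = canonical_id, score
    if best_score >= threshold:
        return best_id, "fuzzy"
    return None, "needs_review"
```

Anything that lands in the `needs_review` bucket is where the Mechanical Turk idea would come in; where to set the threshold is a judgment call, not something the sketch settles.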

 

DBpedia can do some of the work. It strips structured data from the infoboxes on Wikipedia pages and dumps it into RDF formats, etc. This can pose a problem when you don't necessarily trust the way those infoboxes are structured: are they structured consistently enough to automate the stripping?
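One way to get a feel for the consistency question is to count how often each field name shows up across a batch of infoboxes. This is only a sketch: the `INFOBOXES` records below are invented and assumed to be already parsed into dictionaries, and it does not use any DBpedia API.

```python
from collections import Counter

# Invented infobox records, assumed to be already parsed into dicts.
INFOBOXES = [
    {"name": "A", "birth_date": "1947-07-30", "occupation": "actor"},
    {"name": "B", "born": "1951", "occupation": "politician"},
    {"name": "C", "birth_date": "1960-01-01"},
]


def key_coverage(infoboxes):
    """Fraction of infoboxes in which each field name appears."""
    counts = Counter(key for box in infoboxes for key in box)
    total = len(infoboxes)
    return {key: n / total for key, n in counts.items()}


def inconsistent_keys(infoboxes, min_coverage=0.9):
    """Field names that appear in fewer than min_coverage of the infoboxes."""
    coverage = key_coverage(infoboxes)
    return sorted(key for key, share in coverage.items() if share < min_coverage)
```

Here `birth_date` and `born` would both be flagged, which is exactly the kind of near-duplicate field that makes automated stripping hard to trust.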

 

The issue of reconciliation becomes the point of intersection for Freebase and the Semantic Web: setting up a canonical identifier for each object, and then identifying matched aliases.

 

Patrick is demoing his course identification app, where canonical identifiers are looked up in order to link a course with books read, topics discussed, university name, etc.

  • The goal is to create a catalog that can be searched or sorted by components of the course, so that people can find courses they'd be interested in
  • The application to the humanities proper comes when you start doing higher-level analysis of the course data in order to look for patterns, e.g. in books assigned
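A minimal sketch of what such a catalog might look like, assuming each course is linked to canonical identifiers for its university, books, and topics. This is not Patrick's app; the `Course` class, the identifiers, and the sample data are all made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Course:
    title: str
    university: str                              # made-up canonical identifier
    books: list = field(default_factory=list)    # made-up canonical identifiers
    topics: list = field(default_factory=list)   # made-up canonical identifiers


# Invented sample catalog.
CATALOG = [
    Course("American Novel", "univ:example_a",
           ["book:moby_dick", "book:walden"], ["topic:american_lit"]),
    Course("Sea Narratives", "univ:example_b",
           ["book:moby_dick"], ["topic:travel_writing"]),
    Course("Victorian Poetry", "univ:example_a",
           ["book:in_memoriam"], ["topic:poetry"]),
]


def find_courses(catalog, book=None, topic=None, university=None):
    """Filter the catalog by any combination of course components."""
    results = []
    for course in catalog:
        if book is not None and book not in course.books:
            continue
        if topic is not None and topic not in course.topics:
            continue
        if university is not None and course.university != university:
            continue
        results.append(course)
    return results


def most_assigned_books(catalog):
    """Higher-level pattern: which books are assigned across the most courses."""
    counts = Counter(book for course in catalog for book in set(course.books))
    return counts.most_common()
```

Because every course points at the same identifier for a given book, `find_courses(CATALOG, book="book:moby_dick")` finds both courses regardless of how each syllabus spelled the title.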

 

Another way to think about the connections that this sort of semantic/Freebase databasing can have to the actual practices of teaching and research: take the specific, relatively limited data in an application like (for example) http://smarthistory.org and plot it against a larger set of data to draw more general conclusions about influence between artists, similarities between works, etc.

 

http://www.nines.org/ is an example of a group of scholars (in Victorian lit, American lit) getting together to decide on a structured ontology to slice through the properties of their respective corpora.

  • The problem, though, is that you have to find the right level of generalization. At first they found themselves lumping many different poetry genres together as merely 'poetry', so that early queries would actually return results; but as a result, searching ends up being less interesting.
  • After a huge amount of labor on the part of these scholars, they couldn't really find good uses for the structured data (or rather couldn't get other scholars to find good uses)
  • This suggests that there's a real division between the structured data geeks and the scholars who are supposed to use that data. How do they get together to decide on the kind of interface, exemplars, purposes that the data will be used for?

 

Deciding today on categorizations or canonical names for categories can be problematic. (In philosophical terms: extensionally equivalent predicates aren't necessarily intensionally equivalent.) It might be that "Negroes" refers to the same group of people as "African-Americans", but for historians it's not acceptable to have the two linked as equivalent, because there are important differences. [Can you have higher-order linking? These category terms might also be object terms on a higher level?]

  • A solution to this kind of problem might be to look at relationships of connectedness between taxonomic categories rather than at equivalences. Scholars should be able to decide for themselves whether categories collapse into each other.
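A sketch of that idea, assuming categories are stored as typed links rather than as a flat "same as" table. The relation names (`same_as`, `historical_term_for`), the placeholder terms, and the `collapse` function are hypothetical; the grouping uses a small union-find, which is one convenient way to do it.

```python
# Invented typed links between category terms: (subject, relation, object).
RELATIONS = [
    ("term:older_label", "historical_term_for", "term:modern_label"),
    ("term:modern_label", "same_as", "term:variant_spelling"),
]


def collapse(relations, treat_as_equivalent):
    """Group terms, merging only across the relation types the scholar accepts.

    Returns a set of frozensets, one per resulting category group.
    """
    parent = {}

    def find(term):
        # Union-find lookup with path halving.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for subject, relation, obj in relations:
        root_subject, root_obj = find(subject), find(obj)
        if relation in treat_as_equivalent:
            parent[root_subject] = root_obj

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return {frozenset(group) for group in groups.values()}
```

A historian could call `collapse(RELATIONS, {"same_as"})` and keep the older label distinct, while someone running a broad search could pass both relation types and get a single merged group. The underlying links stay the same either way.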

 

 

 

 

 
