Sunday, February 28, 2010

Recitation #6 Slides

The slides for the sixth recitation (r6) are available through the link below:

http://www-student.cse.buffalo.edu/~dlessa/cse462-SP10/r6.pdf

Thursday, February 25, 2010

UPDATE: DBLP Practice Queries

The DBLP practice queries have been updated with minor syntax fixes to support the Oracle SQL dialect. In particular, because of the two non-standard features below:

1. Oracle does not support the keyword AS in FROM for aliasing, but you can specify aliases without AS.


2. The set difference relational operator in Oracle is MINUS rather than EXCEPT.

For other non-standard features in Oracle see:

http://infolab.stanford.edu/~ullman/fcdb/oracle/or-nonstandard.html

New slides: Conceptual Database Design

Wednesday, February 24, 2010

Clarification: Paper and Publication

Guys,

For the purposes of Project #1, paper and publication are synonyms and correspond to an entity in the DBLP_ENTITY table. For question 1.1, you should restrict your search to entities of type inproceedings, as explained in the last recitation (r5).

Demian

Monday, February 22, 2010

Top-K

The Java library file for the first project (p5) is available through the link below:

http://www-student.cse.buffalo.edu/~dlessa/cse462-SP10/topK.jar


About the library:

  • TopKImpl is the skeleton class that you must implement. It is a stub that implements the TopK interface. You may create any number of helper methods in this class.
  • ScoringFunction is the interface that defines the signature of a scoring function. A sample implementation of scoring function, ScoringFunctionImpl, is provided for you to test your implementation.
  • Tuple is a simple class that extracts and encapsulates a single tuple from a ResultSet object.
  • OraConnectionProxy is a helper class for you to connect to Oracle. Use if you like it. First, you must modify the user name and password in order to connect to Oracle.
  • Launcher has a main method that you can use to test your implementation. Two queries are provided with code calling your Top-K implementation. Test data is available to help you check if your algorithm is running correctly (the data has been commented out).

If you have any questions, feel free to send me an e-mail.

Recitation #5 Slides

The slides for the fifth recitation (r5) are available through the link below:

http://www-student.cse.buffalo.edu/~dlessa/cse462-SP10/r5.pdf


Observation about query 2.1 in the project.

I want to make this absolutely clear. Your query may return any number of rows satisfying the given criteria (it doesn't matter how many rows the query returns). However, the third row in the result must be the one containing the "author who published the THIRD smallest number of papers amongst the authors with more than 400 publications." This can only be achieved using the ORDER BY clause.

Thursday, February 18, 2010

Monday, February 15, 2010

Recitation #4 Slides

The slides for the fourth recitation (r4) are available through the link below:

http://www-student.cse.buffalo.edu/~dlessa/cse462-SP10/r4.pdf

The practice queries for DBLP are also available here:

http://www-student.cse.buffalo.edu/~dlessa/cse462-SP10/dblp-practice.sql

Before running the queries against the database don't forget to prepend all table names with my schema name, DLESSA, as explained in the slides. If you have problems connecting to Oracle or accessing the tables under my schema, please drop me an e-mail ASAP.

Thursday, February 11, 2010

Monday, February 8, 2010

DISTINCT in SQL

Some clarification is needed after the discussion of DISTINCT in the class today:

1. SELECT DISTINCT A1,...,An FROM...WHERE...GROUP BY...HAVING...
results in eliminating duplicates from the result of the query
SELECT A1,...,An FROM...WHERE...GROUP BY...HAVING...
I.e., duplicate elimination is the last step in SELECT query evaluation.

2. SELECT COUNT(DISTINCT A)...eliminates duplicates from the single column A before applying COUNT (the same for other aggregate functions except COUNT(*)).

3. DISTINCT cannot be used with COUNT(*). So if one wants to count the number of distinct tuples in a relation, one needs to define a view using SELECT DISTINCT and then compute COUNT(*) over the view.

Additional office hours before the test

I will be in my office tomorrow from 11am to 3pm, and Wednesday from 1:30 to 2:30pm.

--Jan Chomicki

Friday, February 5, 2010

Recitation #3 Slides

The slides for the third recitation (r3) are available through the link below:

http://www-student.cse.buffalo.edu/~dlessa/cse462-SP10/r3.pdf

Please, go over the queries in the slides and make sure you understand them. Make sure you also understand the solutions to the Advanced SQL Queries posted by Dr. Chomicki.

If you have questions, you may stop by any time during my office hours. If you cannot make to my office hours (or the professor's), shoot me an e-mail and I'll try to meet you some other time.

Monday, February 1, 2010

Java References

For those of you who asked me for some Java references, here they go:

Thinking in Java, 3rd edition, by Bruce Eckel. This is an excellent free online book. I strongly recommend:

http://www.mindview.net/Books/TIJ/

For a number of good video tutorials on both Java and Eclipse, visit:

http://eclipsetutorial.sourceforge.net/

Hope this helps!

Demian

Advanced SQL queries