Master's Degree Program in
Computer Engineering
Data Mining (6 CFU - 48 hours)
Academic Year 2016/2017

PROJECT

Objectives: (a) application of the data mining techniques learned in the course to the analysis of a real-world dataset or to the development of efficient big-data processing tools; and (b) practical experience with Spark.

Groups: the project is carried out by groups of at most 5 students (ideally, 4-5 students per group). Groups that complete the project by the end of the course and present it to the class during the last 2-3 lectures are called early groups; all other groups are called late groups.

Types of projects:

  • Suggested project: Only early groups can do this type of project. Details on the project are found here.
  • Free project: Groups are welcome to come up with their own project proposal and submit it to the teachers for approval. Both early and late groups can do this type of project. Here are some ideas for free projects:

    • Spark implementation and test of association analysis methods (e.g., association rules, frequent closed itemsets, top-k frequent (closed) itemsets, etc.). In particular, check what is missing or not satisfactory in the currently available Spark libraries.
    • Analysis of a real-world dataset. Datasets and/or inspiration can be gathered from the following sites:

      UC Irvine Machine Learning Repository

      Kaggle Competitions

Deadlines for early groups:

  • 30/4: Send an email to dmcourse@dei.unipd.it specifying the members of the group, a contact email (only one!), and the type of project. In case of a free project, attach a 1-page description of the project.
  • 04/6: Send an email to dmcourse@dei.unipd.it attaching a zipped folder containing the code you developed (with a README file) and two PDF files: a report on the project (max 7 pages) and a presentation (10-min presentation, max 12 slides). (The datasets used should not be attached to the email but should be made available upon request.)

Deadlines for late groups:

  • 11/6: Send an email to dmcourse@dei.unipd.it specifying the members of the group and a contact email (only one!), and attaching a 1-page description of the proposed project.
  • Day when the first group member takes the written exam: Send an email to dmcourse@dei.unipd.it attaching a zipped folder containing the code you developed (with a README file) and a PDF report on the project (max 7 pages). (The datasets used should not be attached to the email but should be made available upon request.) Each group member will have to briefly discuss the project at the oral exam.

Structure of the report/presentation: The report and, for early groups, the presentation (either in Italian or in English) should be structured as follows:

TITLE. Title of the project, and names and student IDs of the group members. For suggested projects, use the title "Suggested Project Goal A/B"; for free projects, come up with a meaningful title.

1. DATASET and OBJECTIVEs. Describe the dataset and the *specific objectives* pursued in the project. For free projects, some information about the context to which the data refer may be useful.

2. DESCRIPTION of the ACTUAL WORK DONE. Describe the various steps performed during the project.

3. RESULTs. Summarize the main results of the analysis. Here, tables and graphs are very helpful.

4. CONCLUSIONs (optional). Write any final remarks and comments you have.

Groups can adapt the above structure as they wish, as long as they do not exceed 7 pages for the report (using a "human readable" font size) and 12 slides for the 10-min presentation.

Evaluation: The evaluation of the project (25% of the final grade) will be based on: report, presentation (in class or at the oral exam), methodological rigor, and effectiveness/performance. Early groups who present their work at the end of the course will receive 1 extra point added to the final grade.


Last updated: 22 May 2017