RS and DM

Pages: 63 (15674 words) Published: September 21, 2011
Chapter 2

Data Mining Methods for Recommender Systems
Xavier Amatriain, Alejandro Jaimes, Nuria Oliver, and Josep M. Pujol

Abstract In this chapter, we give an overview of the main Data Mining techniques used in the context of Recommender Systems. We first describe common preprocessing methods such as sampling or dimensionality reduction. Next, we review the most important classification techniques, including Bayesian Networks and Support Vector Machines. We describe the k-means clustering algorithm and discuss several alternatives. We also present association rules and related algorithms for an efficient training process. In addition to introducing these techniques, we survey their uses in Recommender Systems and present cases where they have been successfully applied.

2.1 Introduction
Recommender Systems (RS) typically apply techniques and methodologies from other neighboring areas, such as Human Computer Interaction (HCI) or Information Retrieval (IR). However, most of these systems have at their core an algorithm that can be understood as a particular instance of a Data Mining (DM) technique. The process of data mining typically consists of 3 steps, carried out in succession: Data Preprocessing [59], Data Analysis, and Result Interpretation (see Figure 2.1). We will analyze some of the most important methods for data preprocessing in Section 2.2. In particular, we will focus on sampling, dimensionality reduction, and the use of distance functions because of their significance and their role in RS. In Sections 2.3 through 2.5, we provide an overview of the data mining methods that are most commonly used in RS: classification, clustering and association rule discovery (see Figure 2.1 for a detailed view of the different topics covered in the chapter).

Xavier Amatriain
Telefonica Research, Via Augusta, 122, Barcelona 08021, Spain
e-mail: xar@tid.es

Alejandro Jaimes
Yahoo! Research, Av. Diagonal, 177, Barcelona 08018, Spain. Work on the chapter was performed while the author was at Telefonica Research.
e-mail: ajaimes@yahoo-inc.com

Nuria Oliver
Telefonica Research, Via Augusta, 122, Barcelona 08021, Spain
e-mail: nuriao@tid.es

Josep M. Pujol
Telefonica Research, Via Augusta, 122, Barcelona 08021, Spain
e-mail: jmps@tid.es

F. Ricci et al. (eds.), Recommender Systems Handbook, DOI 10.1007/978-0-387-85820-3_2, © Springer Science+Business Media, LLC 2011

Fig. 2.1: Main steps and methods in a Data Mining problem, with their correspondence to chapter sections.

This chapter does not intend to give a thorough review of Data Mining methods, but rather to highlight the impact that DM algorithms have in the RS field, and to provide an overview of the key DM techniques that have been successfully used. We shall direct the interested reader to Data Mining textbooks (see [28, 73], for example) or the more focused references that are provided throughout the chapter.

2.2 Data Preprocessing
We define data as a collection of objects and their attributes, where an attribute is defined as a property or characteristic of an object. Other names for object include record, item, point, sample, observation, or instance. An attribute might also be referred to as a variable, field, characteristic, or feature.
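
As a minimal illustration of this definition, the sketch below represents a tiny data set as a list of objects, each one a mapping from attribute names to values. The attribute names (`user_id`, `item_id`, `rating`), the values, and the two helper functions are hypothetical and are not taken from the chapter.

```python
# Hypothetical toy data set: each object (record) is a dict of attributes.
ratings = [
    {"user_id": "u1", "item_id": "i1", "rating": 4.0},
    {"user_id": "u1", "item_id": "i2", "rating": 2.5},
    {"user_id": "u2", "item_id": "i1", "rating": 5.0},
]


def attribute_values(objects, attribute):
    """Return the values that one attribute takes across all objects."""
    return [obj[attribute] for obj in objects]


def attribute_names(objects):
    """Return the sorted set of attribute names used by the objects."""
    return sorted({name for obj in objects for name in obj})
```

Here each dictionary plays the role of an object (record, instance) and each key plays the role of an attribute (variable, feature).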

Real-life data typically needs to be preprocessed (e.g. cleansed, filtered, transformed) in order to be used by the machine learning techniques in the analysis step. In this section, we focus on three issues that are of particular importance when designing an RS. First, we review different similarity or distance measures. Next, we discuss the issue of sampling as a way to reduce the number of items in very large collections while preserving their main characteristics. Finally, we describe the most common techniques to reduce dimensionality.
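
The first two of these issues can be sketched with the Python standard library alone. The block below is an illustration under stated assumptions, not the chapter's own method: it uses Euclidean distance and cosine similarity as two common examples of such measures, and simple random sampling without replacement as one basic sampling scheme. The function names, the fixed seed, and the handling of zero vectors are hypothetical choices made for the example.

```python
import math
import random


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        # Assumption for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_x * norm_y)


def sample_without_replacement(items, fraction, seed=0):
    """Draw a simple random sample holding roughly `fraction` of the items."""
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    k = min(len(items), max(1, round(len(items) * fraction)))
    return rng.sample(items, k)
```

For example, `sample_without_replacement(list(range(100)), 0.1)` returns 10 distinct items, and `euclidean_distance([0, 0], [3, 4])` evaluates to `5.0`. Whether a sample of a given size preserves the main characteristics of the full collection depends on the data and is not checked by this sketch.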

2.2.1 Similarity Measures
One of...