RIMA

RIMA (Rap Italiano: Music and Artists) is a structured dataset of songs by 104 Italian rap artists, both male and female, together with features describing those songs. Designed in a database-like format, it supports research in fields such as natural language processing, music information retrieval, sociolinguistics, and digital humanities.

Each artist is assigned a unique identifier, stored in a central indexing file that ensures consistent cross-referencing across all components. A separate metadata file provides demographic and geographic details about the artists, such as gender, birth year, years active, and place of origin. These data are primarily sourced from Wikipedia.

For each artist, the dataset contains the full discography, with each song represented by 199 features organized into several categories. Metadata fields include internal identifiers as well as external IDs from platforms such as Genius and Spotify. The lyrics of each song are included in full, accompanied by a rich set of linguistic features extracted mainly with the Profiling-UD framework. These cover syntactic and morphological properties, lexical statistics, and explicit information on profanities in both Italian and English (whether any occur, and which ones). Audio-derived features are also provided, describing aspects such as tempo, signal amplitude, rhythmic patterns, and energy dynamics.

By integrating textual, linguistic, and acoustic data, RIMA enables a wide range of analyses, including artist profiling, stylistic and sociolinguistic studies, gender-based comparisons, and multimodal modeling of musical and lyrical content within the Italian rap genre.
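The cross-referencing described above (artist index, artist metadata, and per-song features linked by a shared artist identifier) can be illustrated with a small sketch. This is not the dataset's actual schema: the field names (`artist_id`, `name`, `gender`, `birth_year`, `song_id`, `tempo`, `n_profanities`), the identifier format, and all values are hypothetical placeholders, and the real files may be organized differently.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical stand-ins for the three components described above.
# Every field name and value here is invented for illustration only.
artist_index = [
    {"artist_id": "A001", "name": "Artist One"},
    {"artist_id": "A002", "name": "Artist Two"},
]
artist_metadata = [
    {"artist_id": "A001", "gender": "F", "birth_year": 1990},
    {"artist_id": "A002", "gender": "M", "birth_year": 1985},
]
songs = [
    {"song_id": "S1", "artist_id": "A001", "tempo": 90.0, "n_profanities": 2},
    {"song_id": "S2", "artist_id": "A001", "tempo": 100.0, "n_profanities": 0},
    {"song_id": "S3", "artist_id": "A002", "tempo": 120.0, "n_profanities": 5},
]


def join_songs_with_artists(songs, artist_index, artist_metadata):
    """Attach artist name and metadata to each song via the shared artist_id."""
    names = {row["artist_id"]: row["name"] for row in artist_index}
    meta = {row["artist_id"]: row for row in artist_metadata}
    joined = []
    for song in songs:
        aid = song["artist_id"]
        # Skip songs whose artist is missing from either lookup table.
        if aid not in names or aid not in meta:
            continue
        joined.append({**song, "name": names[aid], **meta[aid]})
    return joined


def mean_by(rows, group_key, value_key):
    """Average value_key within each group defined by group_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {key: mean(values) for key, values in groups.items()}


joined = join_songs_with_artists(songs, artist_index, artist_metadata)
tempo_by_gender = mean_by(joined, "gender", "tempo")
print(tempo_by_gender)
```

The same join-then-aggregate pattern would apply to any of the linguistic or audio features, for example comparing a profanity count across gender or birth-year groups, once the real column names are known.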

Data and Resources
  • RIMA v1 (ZIP)

Personal Data Attributes

Description: Personal Data related Information

Field Value
Anonymised No
Children Data No
Cross Border Authorised Yes
Ethics Committee Approval (if not Sensitive Data) No
General Data Yes
Personal Data No
Personal data was manifestly made public by the data subject No
Sensitive Data No
Additional Info
Field Value
Accessibility Both
Associate Project FAIR
Basic rights Download
Creation Date 2025-06-30 17:15
Creator Setzu, Mattia, mattia.setzu@unipi.it, orcid.org/0000-0001-8351-9999
Creator Pollacci, Laura, laura.pollacci@unipi.it, orcid.org/0000-0001-9914-1943
Data sharing agreement yes
Dataset Citation Pollacci, L., & Setzu, M. (2025). RIMA (Rap Italiano: Music and Artists): A Multimodal Dataset for Computational Analysis of Italian Rap Music.
Field/Scope of use Non-commercial research only
Group Societal Debates and Misinformation
Language ita, Italian
License term 2025-06-30 17:15/2055-07-07 17:15
Processing Degree Primary
SoBigData Node SoBigData IT
Sublicense rights No
Territory of use World Wide
Thematic Cluster Text and Social Media Mining [TSMM]
system:type Dataset
Management Info
Field Value
Author POLLACCI LAURA
Maintainer POLLACCI LAURA
Version 1
Last Updated 22 July 2025, 04:48 (CEST)
Created 22 July 2025, 04:48 (CEST)