Abstract:
What factors determine the impact of a scientific paper? Does its impact extend to patents and software development, or is it primarily confined to academic circles? The current literature predominantly adopts a descriptive approach and emphasizes patent citations as indicators of industry impact, while the role of research in driving software development is overlooked. To address this gap, we quantitatively assessed the impact of research papers on both patents and software repositories. Using a computational social science approach, we collected, curated, and analyzed a large-scale dataset of 200K papers published between 1980 and 2022 across the research areas of AI, Computer Vision, Data Mining, Databases, HCI, and NLP, drawn from conferences such as NeurIPS, ICML, ACL, CVPR, CHI, KDD, and The Web Conference. We found that, on average, 7.1% of papers from these venues became patents and 11.6% went into repositories, significantly higher than for top general science journals (3.8% for patents and 0.02% for repositories). Although these papers are a minority, they have received a disproportionate share of citations: 4% of AI papers became patents and 18% went into repositories, yet they have received 29% and 42% of the area’s academic citations, respectively. However, after using survival analysis to correct for papers being published at different times, we found a significant time lag between a paper's publication and its uptake in patents or repositories (10-15 years for patents, 5 years for repositories in Computer Vision and NLP, and even longer, at 30 years, for top general science journals). As for consistent trends, Deep Learning has grown exponentially in popularity, and “papers with code” are becoming the norm. Finally, we showed that a paper’s publication venue and the extent to which it builds upon (un)conventional knowledge determine its impact on patents and repositories, with greater conventionality predicting impact on patents and lesser conventionality predicting impact on repositories.