The objective of our study was to systematically review the literature on the application of artificial intelligence (AI) to renal mass characterization, with a focus on methodologic quality items. A systematic literature search was conducted using PubMed to identify original research studies on the application of AI to renal mass characterization. In addition to baseline study characteristics, a total of 15 methodologic quality items were extracted and evaluated in four main categories: modeling, performance evaluation, clinical utility, and transparency. The qualitative synthesis was presented using descriptive statistics with an accompanying narrative. Thirty studies were included in this systematic review. Overall, the methodologic quality items were mostly favorable for modeling (63%) and performance evaluation (63%). Even so, more than half of the studies (57%) built their work on nonrobust features. Furthermore, only a few studies (10%) assessed generalizability with independent or external validation. The studies were mostly unsuccessful in terms of clinical utility evaluation (89%) and transparency (97%) items. For clinical utility, the notable findings were the lack of comparison with radiologists’ evaluation (87%) and with traditional models (70%) in most of the studies. For transparency, most studies (97%) did not share their data with the public. To bring AI-based renal mass characterization from research to practice, future studies need to improve modeling and performance evaluation strategies and pay attention to clinical utility and transparency issues.