Bibliographic details
Main author: Anonymous
Format: Digital resource
Published: Zenodo 2026
Online access: https://doi.org/10.5281/zenodo.19201952
Summary:
  • <h1>LLM-based Requirements &amp; User Story Assessment Tool</h1> <h3><strong>Live App:</strong> <a href="https://llmjudge4requirements.streamlit.app/" rel="nofollow">https://llmjudge4requirements.streamlit.app</a></h3> <p dir="auto">Check below for the demo video.</p> <p dir="auto">An AI-powered evaluation tool that uses large language models (LLMs) as judges to assess the quality of software requirements and user stories. It is designed for academic use, research, and instructor-led grading workflows.</p> <div dir="auto"> <h2>What It Does</h2> </div> <p dir="auto">Given a <strong>scenario description</strong> and a student's submitted <strong>requirements document</strong>, the tool evaluates the submission in one of two modes:</p> <div dir="auto"> <h3>Standards-based Mode</h3> </div> <p dir="auto">Evaluates requirements and user stories against established software engineering standards:</p> <table> <thead> <tr> <th>Standard</th> <th>Applied To</th> <th>Characteristics</th> </tr> </thead> <tbody> <tr> <td><strong>ISO/IEC/IEEE 29148</strong></td> <td>Functional &amp; non-functional requirements</td> <td>Necessary, Correct, Unambiguous, Complete, Consistent, Feasible, Verifiable, Traceable, Conforming</td> </tr> <tr> <td><strong>QUS</strong> (Quality User Stories)</td> <td>User stories</td> <td>Feature Specificity, Rationale Clarity, Problem Oriented, Language Clarity, Internal Consistency</td> </tr> <tr> <td><strong>INVEST</strong></td> <td>User stories</td> <td>Independent, Negotiable, Valuable, Estimable, Small, Testable</td> </tr> </tbody> </table> <div dir="auto"> <h3>Custom Mode</h3> </div> <p dir="auto">Evaluates requirements against a fully configurable set of characteristics. 
You can select from predefined custom criteria in the sidebar or define your own with a label and definition.</p> <div dir="auto"> <h2>Key Features</h2> </div> <ul> <li><strong>Multiple LLM providers</strong> — Claude (Anthropic), GPT (OpenAI), Llama (via OpenRouter), or Mock mode for demos</li> <li><strong>Flexible input</strong> — Paste requirements as plain text or upload a CSV/TXT file</li> <li><strong>Per-characteristic scoring</strong> — Each requirement or user story receives a <code>Yes / Partial / No</code> verdict and a numeric score (1.0 / 0.5 / 0.0) per characteristic</li> <li><strong>Visual summaries</strong> — Bar charts showing average score per characteristic (Standards-based mode)</li> <li><strong>Downloadable reports</strong> — Export results as a formatted <strong>PDF</strong> or <strong>Excel</strong> file, including scores, charts, characteristics used, and full evaluation tables</li> </ul> <div dir="auto"> <h2>Using the App</h2> </div> <ol> <li><strong>Settings sidebar</strong> — Choose an LLM provider, enter your API key, and select the evaluation type (Standards-based or Custom)</li> <li><strong>Student details</strong> — Enter the student's name, ID, and any extra notes</li> <li><strong>Scenario</strong> — Describe the system or context the requirements relate to</li> <li><strong>Student submission</strong> — Paste text directly, or upload a CSV/TXT file containing requirements and user stories</li> <li><strong>Evaluate</strong> — Click <strong> Evaluate Submission</strong></li> <li><strong>Results</strong> — View per-characteristic scores and verdicts, browse charts, and download the full PDF or Excel report</li> </ol> <div dir="auto"> <h2>CSV/TXT Upload Format</h2> </div> <p dir="auto">When using file upload, the file must contain at least these two columns:</p> <div> <pre><code>id, body </code></pre> </div> <p dir="auto">Rows are classified automatically by their <strong>ID prefix</strong>:</p> <table> <thead> <tr> <th>ID prefix</th> 
<th>Classified as</th> </tr> </thead> <tbody> <tr> <td><code>R_</code></td> <td>Functional requirement</td> </tr> <tr> <td><code>NFR</code></td> <td>Non-functional requirement</td> </tr> <tr> <td><code>US</code></td> <td>User story</td> </tr> </tbody> </table> <p dir="auto">Example:</p> <div dir="auto"> <pre>id,body
R_1,The system shall allow users to log in with email and password.
R_2,The system shall support role-based access control.
NFR1,The system shall respond within 2 seconds under normal load.
NFR2,The system shall be available 99.9% of the time.
US1,As a user I want to reset my password so that I can regain access to my account.
US2,As an admin I want to view all users so that I can manage accounts.</pre> </div> <div dir="auto"> <h2>LLM Providers</h2> </div> <table> <thead> <tr> <th>Provider</th> <th>Model</th> <th>API Key Required</th> </tr> </thead> <tbody> <tr> <td><strong>Mock</strong></td> <td>Simulated responses</td> <td>No</td> </tr> <tr> <td><strong>Claude</strong></td> <td>Claude Sonnet 4 (Anthropic)</td> <td>Yes (<a href="https://console.anthropic.com/" rel="nofollow">console.anthropic.com</a>)</td> </tr> <tr> <td><strong>GPT</strong></td> <td>GPT-5.2 (OpenAI)</td> <td>Yes (<a href="https://platform.openai.com/" rel="nofollow">platform.openai.com</a>)</td> </tr> <tr> <td><strong>Llama</strong></td> <td>Llama 4 Maverick via OpenRouter</td> <td>Yes (<a href="https://openrouter.ai/" rel="nofollow">openrouter.ai</a>)</td> </tr> </tbody> </table>
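<p dir="auto">The upload format and scoring scheme described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the tool's actual code: the names <code>classify</code>, <code>load_submission</code>, and <code>average_scores</code> are hypothetical, prefix matching is assumed to be case-sensitive, and rows whose ID matches no known prefix are assumed to stay unclassified.</p>

```python
import csv
import io
from collections import defaultdict

# Prefix-to-category table, copied from the "ID prefix" table in this document.
PREFIX_TO_KIND = {
    "R_": "Functional requirement",
    "NFR": "Non-functional requirement",
    "US": "User story",
}

# Verdict-to-score mapping, as listed under "Per-characteristic scoring".
VERDICT_SCORE = {"Yes": 1.0, "Partial": 0.5, "No": 0.0}


def classify(item_id):
    """Return the category for an ID, or None if no known prefix matches.

    Assumption: matching is case-sensitive; the real app may differ.
    """
    for prefix, kind in PREFIX_TO_KIND.items():
        if item_id.startswith(prefix):
            return kind
    return None


def load_submission(text):
    """Parse CSV text with `id` and `body` columns into classified rows."""
    # skipinitialspace tolerates a header written as "id, body".
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for record in reader:
        item_id = record["id"].strip()
        rows.append(
            {"id": item_id, "body": record["body"].strip(), "kind": classify(item_id)}
        )
    return rows


def average_scores(verdicts):
    """Average the numeric score per characteristic.

    `verdicts` maps each item ID to a dict of characteristic name to
    one of "Yes", "Partial", or "No".
    """
    collected = defaultdict(list)
    for per_item in verdicts.values():
        for characteristic, verdict in per_item.items():
            collected[characteristic].append(VERDICT_SCORE[verdict])
    return {name: sum(scores) / len(scores) for name, scores in collected.items()}


if __name__ == "__main__":
    sample = (
        "id,body\n"
        "R_1,The system shall allow users to log in with email and password.\n"
        "NFR1,The system shall respond within 2 seconds under normal load.\n"
        "US1,As a user I want to reset my password so that I can regain access to my account.\n"
    )
    for row in load_submission(sample):
        print(row["id"], "is a", row["kind"])
```

<p dir="auto">Because this sketch parses the file as CSV, a body that itself contains commas would need to be wrapped in double quotes. The per-characteristic averages returned by <code>average_scores</code> correspond to the values the bar charts are described as showing in Standards-based mode.</p>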