### Talk2Scholars Tutorial

This tutorial walks you through using Talk2Scholars for academic paper search and analysis. It covers installation, setup, and how to use each of the available tools.

#### Understanding the Workflow

Our hierarchical agent system follows a specific workflow:

1. Query Processing:
   - User query is received by the main agent
   - Supervisor agent determines the appropriate sub-agent
   - S2 agent handles academic paper operations

2. Result Management:
   - Tools perform specific operations (search, recommendations)
   - Results are stored in the state
   - display_results tool acts as the interface for retrieving results

3. Result Presentation:
   - display_results ensures consistent formatting
   - Handles error cases (no results found)
   - Provides structured output for visualization

The display_results tool is essential as it:
- Provides a consistent interface for accessing results
- Ensures proper error handling
- Maintains data structure consistency
- Makes the codebase more maintainable

Example workflow diagram:
User Query → Main Agent → S2 Agent → Tool Operations → display_results → Formatted Output
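
The flow in the diagram can be pictured with a toy model in plain Python. This is only a sketch and not the package's implementation: `route_query`, `s2_agent`, `fake_search`, and `show_results` are hypothetical stand-ins for the supervisor, the S2 agent, a tool, and display_results.

```python
from typing import Callable, Dict

# Hypothetical stand-ins; none of these names come from aiagents4pharma.
State = Dict[str, dict]

def fake_search(query: str, state: State) -> None:
    """Pretend tool: stores one made-up paper in the shared state."""
    state["papers"] = {"paper-1": {"Title": f"Results for: {query}"}}

def s2_agent(query: str, state: State) -> None:
    """Sub-agent: picks a tool and lets it write into the state."""
    fake_search(query, state)

def route_query(query: str) -> Callable[[str, State], None]:
    """Supervisor: choose a sub-agent. Only one exists in this toy model."""
    return s2_agent

def show_results(state: State) -> str:
    """Display step: read from the state and format, or report no results."""
    papers = state.get("papers", {})
    if not papers:
        return "No papers found"
    return "\n".join(f"{pid}: {p['Title']}" for pid, p in papers.items())

state: State = {}
query = "machine learning in healthcare"
agent = route_query(query)
agent(query, state)
print(show_results(state))
```

The point of the sketch is the separation of concerns: tools only write to the state, and one display function is the only place that reads and formats it.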

#### Installation

First, install the aiagents4pharma package using pip (uncomment the command below to run it from the notebook):

In [1]:
# !pip install aiagents4pharma
# This will install the package and all required dependencies

#### Set up your API key

Before using the tools, you need to set up your OpenAI API key. You can do this in two ways:

##### Option 1: Use a `.env` file (recommended)

1. Create a `.env` file in your project root.
    Example project structure:
    ```bash
    AIAgents4Pharma/
    ├── .env
    ```

2. Add the following line to your `.env` file (replace 'your_api_key_here' with your actual key):
    ```bash
    OPENAI_API_KEY=your_api_key_here
    ```

3. Use the code below to load your API key from the `.env` file.
   If your `.env` file is in a different directory, replace the placeholder path in the code with its actual location:

In [2]:
# Import required libraries
import os
from dotenv import load_dotenv

In [3]:
# Specify the path to your .env file
dotenv_path = "./.env"  # Replace this with the actual path to your .env file if it's not in the same directory
load_dotenv(dotenv_path)

# Fetch the API key from the environment variables
api_key = os.getenv("OPENAI_API_KEY")

# Validate that the API key was loaded successfully
if not api_key:
    raise ValueError(f"API key not found in {dotenv_path}! Ensure the .env file exists and contains the API key.")

# Print statement for debugging (optional, remove in production)
print(f"Loaded API key: {api_key[:5]}... (truncated for security)")

Loaded API key: sk-pr... (truncated for security)
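
If more than one key has to be loaded, the check above can be wrapped in a small helper. The following is a minimal sketch using only the standard library; `require_env` and `mask` are hypothetical helper names, and the demo uses a dummy variable so that it runs without a real key.

```python
import os

def require_env(name: str) -> str:
    """Return the environment variable `name`, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise ValueError(f"{name} is not set. Check your .env file or shell environment.")
    return value

def mask(secret: str, visible: int = 5) -> str:
    """Show only the first few characters of a secret for debugging."""
    return secret[:visible] + "..."

# Dummy variable for illustration only; this is not a real key.
os.environ["DEMO_API_KEY"] = "sk-demo-not-a-real-key"
print(mask(require_env("DEMO_API_KEY")))  # prints "sk-de..."
```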


##### Option 2: Set API Key Directly (for testing only)

If you are just testing the script and don’t want to create a `.env` file, 
you can set the API key directly in your script. 
Note: Avoid using this method in production as it exposes your key in plain text.

In [4]:
# Set API key directly (for testing only)
# os.environ["OPENAI_API_KEY"] = "your_api_key_here"

# Example of using the API key
def example_functionality():
    # Read the key at call time so a key set via os.environ above is picked up
    if not os.getenv("OPENAI_API_KEY"):
        raise ValueError("API key is not set! Ensure you loaded it correctly.")
    print("API key is ready to use!")

# Call the example function
example_functionality()

API key is ready to use!


#### 1. Paper Search Tool

Let's start by searching for academic papers. Note that we'll use display_results as our 
interface for viewing results, maintaining a consistent pattern throughout the workflow.

In [5]:
# Import the package
import pandas as pd
from aiagents4pharma.talk2scholars.tools.s2.search import search_tool
from aiagents4pharma.talk2scholars.tools.s2.display_results import display_results

# Define search parameters
search_input = {
    "query": "machine learning in healthcare",
    "limit": 5,
    "year": "2023-",
    "tool_call_id": "search_demo_1"
}

try:
    # Execute search
    search_results = search_tool.invoke(input=search_input)
    
    # Use display_results as the interface to view results
    state = {"papers": search_results.update.get("papers", {})}
    results = display_results.invoke({"state": state})
    print("Search results retrieved successfully")
except Exception as e:
    print(f"Error during search: {e}")



Starting paper search...


INFO:aiagents4pharma.talk2scholars.tools.s2.search:Received 0 papers
INFO:aiagents4pharma.talk2scholars.tools.s2.display_results:Displaying papers from the state
INFO:aiagents4pharma.talk2scholars.tools.s2.display_results:No papers found in state, indicating search is needed


Search results retrieved successfully
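
Note that the run above logged `Received 0 papers` and still printed the success message, because the cell only checks for exceptions. The sketch below checks the result before reporting success. It assumes the papers are a dict keyed by paper ID whose values carry a `Title` field, which is the shape shown in the main-agent log output later in this tutorial; `summarize_papers` is a hypothetical helper.

```python
def summarize_papers(papers: dict) -> list:
    """Return one 'id: title' line per paper; an empty list means no results."""
    return [f"{pid}: {info.get('Title', '(no title)')}" for pid, info in papers.items()]

# Sample data shaped like the state dict in the main-agent log output.
sample = {
    "ac2cffc4b9f96bae24809d738777ae897094ae33": {
        "Title": "A Comprehensive Review on Machine Learning in Healthcare Industry: "
                 "Classification, Restrictions, Opportunities and Challenges",
    }
}

for papers in ({}, sample):
    lines = summarize_papers(papers)
    if lines:
        print(f"Found {len(lines)} paper(s):")
        print("\n".join(lines))
    else:
        print("Search returned no papers; try a broader query or year range.")
```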


#### 2. Display Results Tool

The display_results tool is a crucial interface component in our workflow. Unlike directly 
accessing search or recommendation results, this tool provides:

1. Standardized Interface:
   - Consistent format for all paper data
   - Proper error handling
   - State validation

2. Result Management:
   - Handles both search results and recommendations
   - Validates paper data structure
   - Manages empty result cases
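
The three cases listed above (populated, empty, and invalid state) can be modelled with a small validator. This is an illustrative stand-in and not the code inside display_results; the exception class defined here only borrows the `NoPapersFoundError` name that appears in a comment in this tutorial's examples, and `validate_state` is a hypothetical function.

```python
class NoPapersFoundError(Exception):
    """Local stand-in for the error named in the tutorial's examples."""

def validate_state(state: dict) -> dict:
    """Return the papers dict, raising if the key is missing or empty."""
    if "papers" not in state:
        raise KeyError("state has no 'papers' key")
    papers = state["papers"]
    if not papers:
        raise NoPapersFoundError("no papers in state; run a search first")
    return papers

candidates = (
    {"papers": {"p1": {"Title": "Example"}}},  # populated
    {"papers": {}},                            # empty
    {"invalid_key": {}},                       # invalid
)
for candidate in candidates:
    try:
        print("ok:", list(validate_state(candidate)))
    except (KeyError, NoPapersFoundError) as err:
        print(f"{type(err).__name__}: {err}")
```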

Here's how to use the display_results tool:

In [None]:
from aiagents4pharma.talk2scholars.tools.s2.display_results import display_results

# Example 1: Displaying search results
def display_search_results():
    try:
        # First get some search results
        search_input = {
            "query": "machine learning in healthcare",
            "limit": 5,
            "year": "2023-",
            "tool_call_id": "search_demo_1"
        }
        search_results = search_tool.invoke(input=search_input)
        
        # Use display_results as the interface
        state = {"papers": search_results.update.get("papers", {})}
        results = display_results.invoke({"state": state})
        print("Results retrieved successfully")
        return results
    except Exception as e:
        print(f"Error displaying results: {e}")
        return None

# Example 2: Handling empty results
def handle_empty_results():
    try:
        # Create an empty state
        empty_state = {"papers": {}}
        results = display_results.invoke({"state": empty_state})
        print("Empty state handled successfully")
    except Exception as e:
        # This should raise NoPapersFoundError
        print(f"Expected error for empty results: {e}")

# Example 3: Error handling with invalid state
def handle_invalid_state():
    try:
        # Try to display results with invalid state
        invalid_state = {"invalid_key": {}}
        results = display_results.invoke({"state": invalid_state})
    except Exception as e:
        print(f"Expected error for invalid state: {e}")

"""
The display_results tool is designed to be used after any operation that retrieves papers:
- After paper searches
- After getting recommendations
- After main agent operations

Always use display_results instead of accessing raw data directly. This ensures:
1. Consistent error handling
2. Proper state validation
3. Standardized result format
"""

#### 3. Single Paper Recommendations

Get recommendations based on a specific paper, using display_results as our interface.

In [7]:
from aiagents4pharma.talk2scholars.tools.s2.single_paper_rec import get_single_paper_recommendations

# Get first paper ID from previous search results
paper_id = next(iter(search_results.update.get("papers", {})), None)

if paper_id:
    try:
        # Get recommendations
        rec_input = {
            "paper_id": paper_id,
            "limit": 3,
            "year": "2022-",
            "tool_call_id": "rec_demo_1"
        }
        recommendations = get_single_paper_recommendations.invoke(input=rec_input)
        
        # Use display_results to view recommendations
        state = {"papers": recommendations.update.get("papers", {})}
        results = display_results.invoke({"state": state})
        print("Recommendations retrieved successfully")
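    except Exception as e:
        print(f"Error getting recommendations: {e}")
else:
    # Nothing to recommend from if the earlier search returned no papers
    print("No paper ID available; run a search that returns results first")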

#### 4. Multi-Paper Recommendations

Get recommendations based on multiple papers, maintaining our consistent interface pattern.

In [8]:
from aiagents4pharma.talk2scholars.tools.s2.multi_paper_rec import get_multi_paper_recommendations

paper_ids = list(search_results.update.get("papers", {}).keys())[:2]

if paper_ids:
    try:
        # Get multi-paper recommendations
        multi_rec_input = {
            "paper_ids": paper_ids,
            "limit": 3,
            "year": "2022-",
            "tool_call_id": "multi_rec_demo_1"
        }
        multi_recommendations = get_multi_paper_recommendations.invoke(input=multi_rec_input)
        
        # Use display_results as the interface
        state = {"papers": multi_recommendations.update.get("papers", {})}
        results = display_results.invoke({"state": state})
        print("Multi-paper recommendations retrieved successfully")
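    except Exception as e:
        print(f"Error getting multi-paper recommendations: {e}")
else:
    # Nothing to recommend from if the earlier search returned no papers
    print("No paper IDs available; run a search that returns results first")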
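
Both recommendation cells take their IDs from the earlier search results. The sketch below shows one way to do that defensively, assuming the same dict-keyed-by-ID shape; `pick_paper_ids` is a hypothetical helper and the IDs are made up.

```python
def pick_paper_ids(papers: dict, count: int) -> list:
    """Return up to `count` paper IDs, in insertion order."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return list(papers)[:count]

papers = {"id-a": {}, "id-b": {}, "id-c": {}}
print(pick_paper_ids(papers, 2))  # first two IDs, e.g. for multi-paper recommendations
print(pick_paper_ids({}, 2))      # empty list: skip the recommendation call
```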

#### Using the Main Agent

The main agent provides an integrated experience that coordinates between different tools. 
It uses a hierarchical system where:
1. The supervisor agent routes queries appropriately
2. The S2 agent handles academic paper operations
3. The display_results tool manages result presentation

This structure ensures a consistent and reliable user experience.

In [9]:
from aiagents4pharma.talk2scholars.agents.main_agent import get_app
from langchain_core.messages import HumanMessage
from aiagents4pharma.talk2scholars.state.state_talk2scholars import Talk2Scholars

thread_id = "tutorial_demo"
app = get_app(thread_id=thread_id, llm_model="gpt-4o-mini")

# Create initial state
initial_state = Talk2Scholars(
    messages=[
        HumanMessage(content="search for recent papers about machine learning in healthcare")
    ]
)

try:
    # Run query through the main agent
    response = app.invoke(
        initial_state,
        config={
            "configurable": {
                "thread_id": thread_id,
                "checkpoint_ns": "demo_namespace",
                "checkpoint_id": "initial_checkpoint",
            }
        },
    )
    
    # Consistently use display_results as our interface
    state = {"papers": response.get("papers", {})}
    results = display_results.invoke({"state": state})
    print("Results retrieved successfully through main agent")
except Exception as e:
    print(f"Error during main agent execution: {e}")

INFO:aiagents4pharma.talk2scholars.agents.main_agent:Hydra configuration loaded with values: {'_target_': 'agents.main_agent.get_app', 'openai_api_key': '${oc.env:OPENAI_API_KEY}', 'openai_llms': ['gpt-4o-mini', 'gpt-4-turbo', 'gpt-3.5-turbo'], 'temperature': 0, 'main_agent': 'You are an intelligent research assistant coordinating academic paper discovery and analysis.\nAVAILABLE TOOLS AND ROUTING: 1. semantic_scholar_agent:\n   Access to tools:\n   - search_tool: For paper discovery\n   - display_results: For showing paper results\n   - get_single_paper_recommendations: For single paper recommendations\n   - get_multi_paper_recommendations: For multi-paper recommendations\n   → ROUTE TO THIS AGENT FOR: Any query about academic papers, research, or articles\n\nROUTING LOGIC: 1. For ANY query mentioning:\n   - Papers, articles, research, publications\n   - Search, find, show, display, get\n   → Route to semantic_scholar_agent\n\n2. Error Handling:\n   - If semantic_scholar_agent reports

decision Routing your request to the semantic_scholar_agent for paper discovery on recent papers about machine learning in healthcare.


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Starting paper search...


INFO:aiagents4pharma.talk2scholars.tools.s2.search:Received 0 papers
INFO:aiagents4pharma.talk2scholars.state.state_talk2scholars:Updating existing state {} with the state dict: {}
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:aiagents4pharma.talk2scholars.tools.s2.display_results:Displaying papers from the state
INFO:aiagents4pharma.talk2scholars.tools.s2.display_results:No papers found in state, indicating search is needed
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Starting paper search...


INFO:aiagents4pharma.talk2scholars.tools.s2.search:Received 5 papers
INFO:aiagents4pharma.talk2scholars.state.state_talk2scholars:Updating existing state {} with the state dict: {'ac2cffc4b9f96bae24809d738777ae897094ae33': {'Title': 'A Comprehensive Review on Machine Learning in Healthcare Industry: Classification, Restrictions, Opportunities and Challenges', 'Abstract': 'Recently, various sophisticated methods, including machine learning and artificial intelligence, have been employed to examine health-related data. Medical professionals are acquiring enhanced diagnostic and treatment abilities by utilizing machine learning applications in the healthcare domain. Medical data have been used by many researchers to detect diseases and identify patterns. In the current literature, there are very few studies that address machine learning algorithms to improve healthcare data accuracy and efficiency. We examined the effectiveness of machine learning algorithms in improving time series healt

Results retrieved successfully through main agent
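
The log above shows one search returning 0 papers and a later one returning 5. When calling a tool directly rather than through the agent, a similar retry can be written by hand. The sketch below uses a fake search function and makes no network call; `search_with_retry` and `fake_search` are hypothetical names.

```python
import time
from typing import Callable

def search_with_retry(search: Callable[[], dict], attempts: int = 3, delay: float = 0.0) -> dict:
    """Call `search` until it returns a non-empty dict or attempts run out."""
    for attempt in range(1, attempts + 1):
        papers = search()
        if papers:
            return papers
        if attempt < attempts:
            time.sleep(delay)  # wait before the next attempt
    return {}

# Fake search: empty on the first call, one paper on the second.
calls = {"n": 0}

def fake_search() -> dict:
    calls["n"] += 1
    return {} if calls["n"] == 1 else {"paper-1": {"Title": "Example"}}

result = search_with_retry(fake_search)
print(f"Got {len(result)} paper(s) after {calls['n']} call(s)")
```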


#### Understanding the Role of display_results

The display_results tool is more than a formatting utility: it is our primary interface for 
retrieving results throughout the system. This consistent pattern provides several benefits:

1. Standardized Access:
   - All results are accessed through the same interface
   - Consistent error handling across all operations
   - Uniform data structure for all results

2. State Management:
   - Proper validation of state contents
   - Clear separation between raw data and presented results
   - Consistent handling of empty or invalid states

3. System Integration:
   - Seamless integration with the hierarchical agent system
   - Consistent interface for both simple queries and complex workflows
   - Reliable result retrieval regardless of operation type

This is why we consistently use display_results instead of accessing raw data directly,
ensuring a robust and maintainable system.

#### Testing the Application

The application includes a test suite. To run it, uncomment the command below:

In [10]:
# !pytest tests/test_langgraph.py -v
# This will run all tests and show detailed output

#### Tips for Best Results

1. For paper searches:
   - Use specific academic terms
   - Include relevant keywords
   - Specify year ranges when needed

2. For recommendations:
   - Use paper IDs from search results
   - Try both single and multi-paper recommendations
   - Adjust the limit parameter based on your needs

3. Error handling:
   - Always check the response structure
   - Use try-except blocks for robust error handling
   - Verify paper IDs exist before requesting recommendations
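
The `year` values used in this tutorial (`"2023-"` and `"2022-"`) are open-ended ranges. The sketch below builds such strings. The closed (`"2016-2020"`) and upper-bounded (`"-2015"`) forms follow the Semantic Scholar API's year filter syntax; it is an assumption that the tools here pass the string through unchanged. `year_filter` is a hypothetical helper.

```python
from typing import Optional

def year_filter(start: Optional[int] = None, end: Optional[int] = None) -> str:
    """Build a year filter string such as '2023-', '-2015', or '2016-2020'."""
    if start is None and end is None:
        raise ValueError("provide at least one of start or end")
    if start is not None and end is not None:
        if start > end:
            raise ValueError("start must not be later than end")
        return str(start) if start == end else f"{start}-{end}"
    return f"{start}-" if start is not None else f"-{end}"

print(year_filter(start=2023))  # open-ended range, as used in this tutorial
print(year_filter(2016, 2020))  # closed range
print(year_filter(end=2015))    # everything up to and including 2015
```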