Web Search

Multi-engine web search with advanced filtering, ranking, and result aggregation capabilities.

The node queries several search engines at once, then filters, ranks, and deduplicates the combined results for research and data collection.

Features

  • Multi-engine search support (Google, Bing, DuckDuckGo, Academic)
  • Advanced query filtering and ranking
  • Result deduplication and relevance scoring
  • Content type and domain filtering
  • Date range and language restrictions

Connector Options

The node uses a reusable connector configuration that applies to all search operations:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKeys | Object | Yes | API keys for search engines (google, bing, etc.) |
| defaultEngine | TEXT | No | Default search engine when not specified |
| maxResults | INT | No | Default maximum results per search (default: 50) |
| rateLimit | INT | No | Requests per minute limit (default: 60) |
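
As a minimal sketch, the connector options above could be modeled in Python as follows. The `ConnectorConfig` class and its `to_payload` method are hypothetical helpers introduced for illustration, not part of the node itself; only the parameter names and default values come from the table.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ConnectorConfig:
    """Hypothetical container for the documented connector options."""
    api_keys: Dict[str, str]              # required: engine name -> API key
    default_engine: Optional[str] = None  # optional, no documented default
    max_results: int = 50                 # documented default
    rate_limit: int = 60                  # documented default (requests per minute)

    def to_payload(self) -> dict:
        """Serialize using the parameter names from the table."""
        if not self.api_keys:
            raise ValueError("apiKeys is required and must not be empty")
        payload = {
            "apiKeys": dict(self.api_keys),
            "maxResults": self.max_results,
            "rateLimit": self.rate_limit,
        }
        if self.default_engine is not None:
            payload["defaultEngine"] = self.default_engine
        return payload


# Placeholder key; substitute a real one in practice.
config = ConnectorConfig(api_keys={"google": "YOUR_GOOGLE_KEY"}, default_engine="google")
print(config.to_payload())
```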

Methods

webSearch

Execute multi-engine web searches with advanced filtering and ranking.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | TEXT | Yes | Search query string |
| engines | Array | No | Search engines to use: google, bing, duckduckgo, academic |
| maxResults | INT | No | Maximum number of results to return |
| filters | Object | No | Search filters and restrictions |
| ranking | TEXT | No | Result ranking method: relevance, date, authority |

{
  "query": "machine learning python tutorials",
  "engines": ["google", "bing", "duckduckgo"],
  "maxResults": 50,
  "filters": {
    "dateRange": "past_year",
    "contentType": ["article", "tutorial"],
    "language": "en",
    "domain": ["github.com", "stackoverflow.com"]
  },
  "ranking": "relevance"
}
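
A request like the one above can be assembled and checked before it is sent. The following is a sketch only: `build_web_search_request` is a hypothetical helper, and the checks simply mirror the allowed values listed in the parameter table. How the node itself validates input is not documented here.

```python
ALLOWED_ENGINES = {"google", "bing", "duckduckgo", "academic"}
ALLOWED_RANKING = {"relevance", "date", "authority"}


def build_web_search_request(query, engines=None, max_results=None,
                             filters=None, ranking=None):
    """Build a webSearch payload, rejecting values the table does not list."""
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query is required")
    request = {"query": query.strip()}
    if engines is not None:
        unknown = set(engines) - ALLOWED_ENGINES
        if unknown:
            raise ValueError(f"unsupported engines: {sorted(unknown)}")
        request["engines"] = list(engines)
    if max_results is not None:
        if max_results <= 0:
            raise ValueError("maxResults must be positive")
        request["maxResults"] = max_results
    if filters:
        request["filters"] = dict(filters)
    if ranking is not None:
        if ranking not in ALLOWED_RANKING:
            raise ValueError(f"unsupported ranking: {ranking}")
        request["ranking"] = ranking
    return request


request = build_web_search_request(
    "machine learning python tutorials",
    engines=["google", "bing", "duckduckgo"],
    max_results=50,
    filters={"dateRange": "past_year", "language": "en"},
    ranking="relevance",
)
```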

Output:

  • results (Array) - Search results with metadata
  • totalResults (Number) - Total number of results found
  • searchTime (Number) - Search execution time in milliseconds
  • engines (Array) - Engines used and their response status
  • query (Object) - Processed query information
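
The output fields can be condensed into a short summary for logging. This sketch assumes each entry in `engines` is an object with `name` and `status` keys and that a healthy engine reports `"ok"`; the source only says the array carries the engines used and their response status, so treat that shape as hypothetical. The sample response is made up for illustration.

```python
def summarize_response(response):
    """Condense the documented output fields into a small summary dict."""
    engines = response.get("engines", [])
    # Assumed entry shape: {"name": ..., "status": ...}
    failed = [e.get("name") for e in engines if e.get("status") != "ok"]
    return {
        "returned": len(response.get("results", [])),
        "total": response.get("totalResults", 0),
        # searchTime is documented in milliseconds
        "seconds": response.get("searchTime", 0) / 1000.0,
        "failedEngines": failed,
    }


# Hypothetical response used only to exercise the function.
sample = {
    "results": [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}],
    "totalResults": 1200,
    "searchTime": 250,
    "engines": [{"name": "google", "status": "ok"},
                {"name": "bing", "status": "timeout"}],
}
summary = summarize_response(sample)
```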

academicSearch

Search academic databases and research repositories for scholarly content.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | TEXT | Yes | Academic search query |
| publicationType | Array | No | Publication types: journal, conference, book, thesis |
| dateRange | TEXT | No | Publication date range |
| citationThreshold | INT | No | Minimum citation count |
| subject | Array | No | Subject areas to search within |

{
  "query": "natural language processing transformers",
  "publicationType": ["journal", "conference"],
  "dateRange": "past_5_years",
  "citationThreshold": 10,
  "subject": ["computer_science", "artificial_intelligence"]
}

newsSearch

Search news sources and current events with real-time filtering.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | TEXT | Yes | News search query |
| sources | Array | No | Specific news sources to search |
| sentiment | TEXT | No | Filter by sentiment: positive, negative, neutral |
| freshness | TEXT | No | Content freshness: hour, day, week, month |

{
  "query": "sustainable energy technology",
  "sources": ["reuters", "bloomberg", "techcrunch"],
  "sentiment": "neutral",
  "freshness": "week"
}

Search Filters

Content Filtering

| Filter | Type | Description | Example Values |
| --- | --- | --- | --- |
| dateRange | TEXT | Time period for results | "past_day", "past_week", "past_month", "past_year" |
| contentType | Array | Type of content to find | ["article", "blog", "news", "tutorial", "research"] |
| language | TEXT | Content language | "en", "es", "fr", "de", "zh" |
| domain | Array | Specific domains to include | ["github.com", "stackoverflow.com", "arxiv.org"] |
| excludeDomains | Array | Domains to exclude | ["pinterest.com", "facebook.com"] |

Quality Filters

| Filter | Type | Description | Example Values |
| --- | --- | --- | --- |
| minWordCount | INT | Minimum content length | 500 |
| authorityScore | FLOAT | Domain authority threshold | 0.7 |
| readabilityLevel | TEXT | Content reading level | "basic", "intermediate", "advanced" |
| contentQuality | TEXT | Quality assessment | "high", "medium", "any" |
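
A `filters` object can be type-checked against the two tables above before a search is run. This is a sketch under assumptions: `validate_filters` is a hypothetical helper, it ignores filter names that are not listed in these tables, and the include/exclude overlap check is an extra sanity check rather than documented behavior.

```python
# Expected Python types for the filters listed in the tables above.
FILTER_TYPES = {
    "dateRange": str,
    "contentType": list,
    "language": str,
    "domain": list,
    "excludeDomains": list,
    "minWordCount": int,
    "authorityScore": (int, float),
    "readabilityLevel": str,
    "contentQuality": str,
}


def validate_filters(filters):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, value in filters.items():
        expected = FILTER_TYPES.get(name)
        if expected is None:
            continue  # not covered by the tables, so not checked here
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{name}: unexpected type {type(value).__name__}")
    included = filters.get("domain")
    excluded = filters.get("excludeDomains")
    if isinstance(included, list) and isinstance(excluded, list):
        overlap = set(included) & set(excluded)
        if overlap:
            problems.append(f"domains both included and excluded: {sorted(overlap)}")
    return problems


good = {"dateRange": "past_year", "contentType": ["article"], "authorityScore": 0.7}
bad = {"minWordCount": "500", "domain": ["github.com"], "excludeDomains": ["github.com"]}
```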

Result Processing

Ranking Algorithms

Relevance Ranking:

  • Query term frequency and positioning
  • Title and header matching weights
  • Domain authority scoring
  • Content freshness factors
  • User engagement signals

Authority Ranking:

  • Domain reputation scores
  • Backlink analysis
  • Content creator credibility
  • Publication venue quality
  • Citation counts (academic content)

Date Ranking:

  • Publication or update timestamps
  • Content freshness scoring
  • Trend relevance analysis
  • Temporal query matching
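
One way to picture the three ranking methods is as different sort keys over the same result list. The sketch below is illustrative only: the weights, the `signals`, `publishedAt`, and `authorityScore` field names, and the weighted-sum formula are all assumptions, since the source lists the factors but not how they are combined.

```python
# Assumed weights for the relevance factors listed above (they sum to 1.0).
RELEVANCE_WEIGHTS = {
    "term_frequency": 0.35,
    "title_match": 0.25,
    "domain_authority": 0.20,
    "freshness": 0.10,
    "engagement": 0.10,
}


def relevance_score(signals):
    """Weighted sum of per-result signals, each expected in the range 0..1."""
    return sum(weight * signals.get(name, 0.0)
               for name, weight in RELEVANCE_WEIGHTS.items())


def rank_results(results, method="relevance"):
    """Sort results best-first according to the chosen ranking method."""
    sort_keys = {
        "relevance": lambda r: relevance_score(r.get("signals", {})),
        # ISO 8601 dates in one format sort correctly as plain strings
        "date": lambda r: r.get("publishedAt", ""),
        "authority": lambda r: r.get("authorityScore", 0.0),
    }
    if method not in sort_keys:
        raise ValueError(f"unsupported ranking: {method}")
    return sorted(results, key=sort_keys[method], reverse=True)


# Hypothetical results used only to exercise the function.
results = [
    {"id": "a", "publishedAt": "2023-01-01", "authorityScore": 0.4,
     "signals": {"term_frequency": 0.9, "title_match": 1.0}},
    {"id": "b", "publishedAt": "2024-06-01", "authorityScore": 0.9,
     "signals": {"domain_authority": 0.9, "freshness": 1.0}},
]
```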

Deduplication

  • URL normalization and comparison
  • Content similarity detection
  • Title and description matching
  • Domain clustering analysis
  • Canonical URL resolution
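
The first item in the list, URL normalization and comparison, can be sketched with the Python standard library. This covers only that one step and is not the node's actual algorithm. The normalization rules are assumptions: treat `http` and `https` as equivalent, drop a leading `www.`, trailing slashes, fragments, and common `utm_*` tracking parameters. The `url` key on each result is also assumed.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of tracking parameters to ignore when comparing URLs.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}


def normalize_url(url):
    """Reduce a URL to a canonical form suitable for equality checks."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if parts.port and parts.port not in (80, 443):
        host = f"{host}:{parts.port}"  # keep non-default ports
    path = parts.path.rstrip("/") or "/"
    query = urlencode(sorted(
        (key, value) for key, value in parse_qsl(parts.query)
        if key not in TRACKING_PARAMS
    ))
    # Scheme is forced to https and the fragment is dropped.
    return urlunsplit(("https", host, path, query, ""))


def deduplicate(results):
    """Keep the first result seen for each normalized URL."""
    seen = set()
    unique = []
    for result in results:
        key = normalize_url(result["url"])
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique


hits = [
    {"url": "http://www.Example.com/page/?utm_source=x&b=2&a=1"},
    {"url": "https://example.com/page?a=1&b=2#section"},
    {"url": "https://example.com/other"},
]
unique_hits = deduplicate(hits)
```

Content similarity detection and domain clustering would need additional techniques beyond this URL-level check.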

Performance Optimization

Caching Strategy

{
  "query": "artificial intelligence trends",
  "engines": ["google", "bing"],
  "caching": {
    "enabled": true,
    "ttl": 3600,
    "keyFactors": ["query", "filters", "engines"],
    "compression": true
  }
}
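
To illustrate how `keyFactors` and `ttl` might interact, here is a small sketch of a cache keyed on the listed request fields. It assumes `ttl` is expressed in seconds and that the key is a hash of the selected fields; neither detail is stated in the source, and `cache_key` and `TTLCache` are hypothetical names.

```python
import hashlib
import json
import time


def cache_key(request):
    """Hash only the request fields named in caching.keyFactors."""
    factors = request.get("caching", {}).get("keyFactors", ["query", "filters", "engines"])
    material = {name: request.get(name) for name in factors}
    encoded = json.dumps(material, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


class TTLCache:
    """In-memory cache whose entries expire after ttl seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def set(self, key, value, ttl):
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: evict and report a miss
            return None
        return value


request = {
    "query": "artificial intelligence trends",
    "engines": ["google", "bing"],
    "caching": {"enabled": True, "ttl": 3600,
                "keyFactors": ["query", "filters", "engines"]},
}
cache = TTLCache()
cache.set(cache_key(request), {"results": []}, ttl=request["caching"]["ttl"])
```

Note that in this sketch the order of `engines` affects the key, so `["google", "bing"]` and `["bing", "google"]` are cached separately.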

Batch Operations

{
  "batchSearch": {
    "queries": [
      "machine learning applications",
      "deep learning frameworks", 
      "neural network architectures"
    ],
    "engines": ["google", "academic"],
    "maxResults": 30,
    "parallel": true,
    "rateLimiting": {
      "requestsPerMinute": 30,
      "delayBetweenRequests": 2000
    }
  }
}
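
The `rateLimiting` block implies a minimum spacing between requests. A rough sketch of how a client might pace a batch is shown below. It assumes `delayBetweenRequests` is in milliseconds, runs the queries sequentially (it ignores the `parallel` flag), and takes a caller-supplied `search_fn`; all of these are illustrative choices rather than documented behavior.

```python
import time


def min_interval_seconds(rate_limiting):
    """Spacing that satisfies both requestsPerMinute and delayBetweenRequests."""
    per_minute = rate_limiting.get("requestsPerMinute")
    delay_ms = rate_limiting.get("delayBetweenRequests", 0)  # assumed milliseconds
    from_rate = 60.0 / per_minute if per_minute else 0.0
    return max(from_rate, delay_ms / 1000.0)


def run_batch(batch, search_fn, sleep=time.sleep):
    """Run each query in order, pausing between consecutive requests."""
    interval = min_interval_seconds(batch.get("rateLimiting", {}))
    shared = {key: batch[key] for key in ("engines", "maxResults") if key in batch}
    responses = {}
    for index, query in enumerate(batch["queries"]):
        if index > 0:
            sleep(interval)
        responses[query] = search_fn({"query": query, **shared})
    return responses


batch = {
    "queries": ["machine learning applications", "deep learning frameworks",
                "neural network architectures"],
    "engines": ["google", "academic"],
    "maxResults": 30,
    "rateLimiting": {"requestsPerMinute": 30, "delayBetweenRequests": 2000},
}
```

With the values shown, both settings work out to the same two-second spacing (60 / 30 requests per minute, and 2000 ms).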

Error Handling

Common Error Responses

| Error Type | Cause | Resolution |
| --- | --- | --- |
| QUOTA_EXCEEDED | API limit reached | Wait for quota reset or use different engine |
| INVALID_QUERY | Malformed search query | Check query syntax and special characters |
| ENGINE_UNAVAILABLE | Search engine down | Use alternative engines or retry later |
| RATE_LIMITED | Too many requests | Implement request throttling |

Error Response Format

{
  "success": false,
  "error": {
    "type": "QUOTA_EXCEEDED",
    "message": "Daily search quota exceeded for Google Search API",
    "engine": "google",
    "retryAfter": 3600,
    "suggestions": [
      "Use alternative search engines",
      "Wait for quota reset",
      "Upgrade API plan"
    ]
  }
}
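
The resolutions in the error table can be combined into a simple recovery loop. This is a sketch under assumptions: `retryAfter` is taken to be in seconds, a response without a `success` field is treated as successful, and `search_fn` stands in for whatever actually calls the node. None of these details are confirmed by the source.

```python
import time

# Errors where dropping the failing engine is a documented resolution.
ENGINE_ERRORS = {"QUOTA_EXCEEDED", "ENGINE_UNAVAILABLE"}
# Errors where waiting and retrying is a documented resolution.
WAIT_ERRORS = {"QUOTA_EXCEEDED", "ENGINE_UNAVAILABLE", "RATE_LIMITED"}


def search_with_recovery(request, search_fn, max_attempts=3, sleep=time.sleep):
    """Retry a search, falling back to other engines or waiting as needed."""
    engines = list(request.get("engines", []))
    response = {}
    for attempt in range(1, max_attempts + 1):
        attempt_request = dict(request)
        if engines:
            attempt_request["engines"] = list(engines)
        response = search_fn(attempt_request)
        if response.get("success", True):
            return response
        error = response.get("error", {})
        kind = error.get("type")
        failed_engine = error.get("engine")
        if kind in ENGINE_ERRORS and failed_engine in engines and len(engines) > 1:
            engines.remove(failed_engine)  # use alternative engines
        elif kind in WAIT_ERRORS and attempt < max_attempts:
            sleep(error.get("retryAfter", 2 ** attempt))  # assumed seconds
        else:
            break  # INVALID_QUERY or unknown errors: retrying will not help
    return response
```

Usage: pass the original request and a function that performs the call, for example `search_with_recovery({"query": "x", "engines": ["google", "bing"]}, my_search)`, where `my_search` is your own wrapper.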

Usage Examples

{
  "query": "Python web scraping tutorial",
  "engines": ["google", "bing"],
  "maxResults": 20,
  "filters": {
    "contentType": ["tutorial", "documentation"],
    "language": "en"
  }
}

{
  "query": "quantum computing algorithms",
  "engines": ["academic"],
  "maxResults": 25,
  "filters": {
    "publicationType": ["journal", "conference"],
    "dateRange": "past_3_years",
    "citationThreshold": 5
  },
  "ranking": "authority"
}

Competitive Intelligence

{
  "query": "Company XYZ product launches 2024",
  "engines": ["google", "bing"],
  "maxResults": 100,
  "filters": {
    "dateRange": "past_year",
    "contentType": ["news", "press_release"],
    "domain": ["techcrunch.com", "venturebeat.com", "bloomberg.com"]
  },
  "ranking": "date"
}

Integration Patterns

With Data Analysis Tools

Search for data and automatically process results for insights and pattern detection.

With File System Tools

Save search results to structured files for offline analysis and archival.

With Web Scraping Tools

Use search results as input for targeted content extraction workflows.

Best Practices

Query Optimization

  • Use specific, targeted keywords
  • Include relevant context terms
  • Utilize search operators when available
  • Test queries across multiple engines

Result Quality

  • Set appropriate quality filters
  • Use domain restrictions wisely
  • Balance quantity vs. relevance
  • Monitor and adjust ranking algorithms

Resource Management

  • Implement proper rate limiting
  • Cache frequently used searches
  • Monitor API quota usage
  • Use batch operations efficiently

Ethical Considerations

  • Respect search engine terms of service
  • Implement appropriate delays between requests
  • Avoid overwhelming target websites
  • Consider data privacy and user consent

Getting Started

  1. Configure API keys for desired search engines
  2. Set default search parameters and rate limits
  3. Test basic queries with different engines
  4. Implement error handling and retry logic
  5. Monitor search performance and optimize queries, filters, and caching as needed
