Personal Site: “Crawled - Currently Not Indexed” Fix
Problem
Google Search Console shows pages as “Crawled - currently not indexed”, meaning Google successfully crawled the pages but chose not to index them. This typically happens when:
- Empty meta descriptions - Makes Google think pages are low quality
- Missing canonical URLs - Google can’t determine preferred page versions
- Incomplete sitemap - Missing pages from sitemap
Root Causes Identified
- Empty Description in
_config.yml:- Had
description: [](empty array) - Caused Jekyll SEO plugin to generate empty meta descriptions
- Google sees empty descriptions as low-quality content
- Had
- Incomplete Sitemap:
- Only contained homepage URL
- Missing blog posts (
_posts/) - Missing other pages (about, blog, docs, etc.)
- Missing Canonical URLs:
- No canonical tags in layout
- Google couldn’t determine preferred page versions
Solutions Implemented
✅ 1. Fixed Site Description
File: _config.yml
Before:
description: []
After:
description: "Thomas Serre is the Thomas J. Watson, Sr. Professor of Science at Brown University. Research in computational neuroscience, AI, brain-inspired vision models, visual perception, and explainable AI."
Impact: Jekyll SEO plugin now generates proper meta descriptions for all pages.
✅ 2. Added Canonical URLs
File: _layouts/default.html
Added:
<link rel="canonical" href="https://thomas-serre.com/INDEXING_FIX.html">
Impact: Every page now has a canonical URL telling Google which version to index.
✅ 3. Created Complete Sitemap
File: sitemap.xml
Before: Only homepage
After: Includes:
- All blog posts from
_posts/ - All pages (homepage, about, blog, docs, etc.)
- Proper priorities and change frequencies
- Last modified dates
Impact: Google can discover all pages and understand site structure.
How It Works
Sitemap Generation
The sitemap is a Jekyll template that automatically:
- Lists all blog posts with their publication dates
- Lists all pages (excluding system files like
_site,_posts,404.html, etc.) - Sets appropriate priorities (homepage = 1.0, others = 0.7-0.8)
- Sets change frequencies (homepage = weekly, others = monthly)
Canonical URLs
Every page now includes:
<link rel="canonical" href="https://tserre.github.io/page-url">
This tells Google:
- Which URL is the “official” version
- Prevents duplicate content issues
- Helps with indexing decisions
Meta Descriptions
Jekyll SEO plugin (`
`) now generates proper descriptions because:
- Site has a valid description in
_config.yml - Each page can override with front matter if needed
- Google sees meaningful content summaries
Next Steps
- Wait for GitHub Pages to Rebuild (5-10 minutes):
- GitHub Pages will rebuild the site with new changes
- Sitemap will be regenerated with all pages
- Canonical URLs will appear on all pages
- Submit Updated Sitemap (in Google Search Console):
- Go to Google Search Console → Sitemaps
- Submit/refresh:
https://tserre.github.io/sitemap.xml - Google will discover all pages
- Request Indexing (Optional, for faster results):
- Use URL Inspection tool in Search Console
- Enter each page URL
- Click “Request Indexing”
- Speeds up the process but not required
- Monitor Progress (1-2 weeks):
- Check Coverage report in Search Console
- Pages should move from “Crawled - not indexed” to “Indexed”
- Should see improvement over 1-2 weeks
Expected Timeline
- Immediate: Changes are live on GitHub
- 5-10 minutes: GitHub Pages rebuilds site
- 1-3 days: Googlebot may recrawl some pages
- 1-2 weeks: Most pages should be indexed
- Ongoing: Google continues to discover and index content
Files Changed
- ✅
_config.yml- Added site description - ✅
_layouts/default.html- Added canonical URL - ✅
sitemap.xml- Complete sitemap with all pages
Additional Notes
Branch
Changes were committed to dev branch. If GitHub Pages deploys from main, you may need to merge:
git checkout main
git merge dev
git push origin main
Jekyll SEO Plugin
The site uses Jekyll SEO plugin (`
`) which automatically generates:
- Page titles
- Meta descriptions
- Open Graph tags
- Twitter Card tags
- Structured data
With the fixed description in _config.yml, all of these will now work properly.
Summary
✅ Fixed: Empty site description
✅ Fixed: Missing canonical URLs
✅ Fixed: Incomplete sitemap
✅ Deployed: Changes pushed to GitHub
The “Crawled - currently not indexed” issue should resolve as Google recrawls pages with proper meta descriptions, canonical URLs, and a complete sitemap.