7 Lessons I Learned While Writing Code With AI
Introduction
I think AI is a wonderful tool. Even so, the recent advances in artificial intelligence have stirred both excitement and concern in the software development community. While some fear that AI might replace human developers, the reality is far more nuanced and promising. As a developer who wants to get better at writing code, I see AI as a powerful ally in the development process, and in this article I will demonstrate why.
Prompt Engineering
One crucial skill that has emerged recently is crafting precise and effective prompts to get the most accurate and useful responses from AI tools. As these tools become part of everyday development work, the ability to communicate with AI systems through well-constructed prompts will become as important as traditional coding skills.
Prompt engineering is not just about asking the right questions; it’s about understanding how to frame your requests in a way that aligns with the AI’s capabilities and limitations. It requires a deep understanding of the AI tool you’re working with, as well as the specific task at hand. By mastering prompt engineering, developers can unlock the full potential of AI assistants, getting more accurate code suggestions, better problem-solving insights, and more efficient automation of tasks.
In this article, as we build a basic program with Python, we’ll explore some basic tips and ideas for better leveraging Large Language Models (LLMs) — that is, we’ll learn how to better engineer our prompts!
If you just want to take a look at the finished code, please visit my GitHub.
For this particular example, we’ll be working with Claude 3.5 Sonnet, developed by Anthropic. While Sonnet requires a paid subscription, there are also free models available that perform quite well at this particular task.
The Initial Concept
I started with a simple goal: to build a Python-based web scraper that could crawl all pages of any small static website, extract the text content, and compile it into a single text file. The key requirements for the program (sketched in code just after this list) were:
- Visit each page associated with the domain
- Scrape all text content from each visited page
- Collate the content into one text file
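To make these requirements concrete, here is a minimal sketch of the overall flow. This is my own illustration rather than Claude’s output; scrape_site and the example URL are hypothetical names used only for demonstration.

```python
# Naive sketch: crawl same-domain links, pull visible text, write one combined file.
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def scrape_site(base_url: str, output_file: str) -> None:
    domain = urlparse(base_url).netloc
    to_visit, seen, chunks = [base_url], set(), []
    while to_visit:
        url = to_visit.pop()
        if url in seen:
            continue
        seen.add(url)
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        chunks.append(soup.get_text(separator="\n", strip=True))
        # Queue every same-domain link found on this page
        for link in soup.find_all("a", href=True):
            target = urljoin(url, link["href"])
            if urlparse(target).netloc == domain:
                to_visit.append(target)
    with open(output_file, "w", encoding="utf-8") as f:
        f.write("\n\n".join(chunks))

# Usage (hypothetical site): scrape_site("https://example.com", "site_content.txt")
```

As we will see, Claude ended up proposing a more structured, sitemap-driven design than this naive link-following loop.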
Laying the Groundwork
Prompting Tip 1: Clarity and Iteration
- Start with a clear, specific request
- If the AI’s response isn’t quite what you need, refine your prompt
- Think of it as an ongoing dialogue to reach the desired outcome
Before diving into coding, I asked Claude to provide the basic logic for this program. This step was crucial to ensure that the AI understood my requirements correctly.
This was the gist of Claude’s response:
Refining the Approach
Prompting Tip 2: Ask for Alternatives
- There might be better ways to solve the problem
- Be specific about what aspects you want alternatives for
- Ask for pros and cons of each alternative
- Inquire about industry best practices or design patterns that could be applied
I gave Claude a rather basic prompt, but you get the idea…
This is part of Claude’s response. Needless to say, there were several things on this list that I had not thought of, especially the idea of utilising sitemaps.
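To illustrate why the sitemap idea is attractive, here is a minimal sketch of reading page URLs directly from a site’s sitemap.xml. The function name urls_from_sitemap and the example URL are my own, not part of Claude’s response.

```python
# Minimal sketch: discover page URLs from a site's sitemap.xml instead of crawling links.
from urllib.parse import urljoin
from xml.etree import ElementTree

import requests

def urls_from_sitemap(base_url: str) -> list[str]:
    response = requests.get(urljoin(base_url, "/sitemap.xml"), timeout=10)
    response.raise_for_status()
    root = ElementTree.fromstring(response.content)
    # Page URLs live in <loc> tags; they are namespace-qualified, hence endswith()
    return [el.text for el in root.iter() if el.tag.endswith("loc") and el.text]

# Usage (hypothetical site): urls_from_sitemap("https://example.com")
```

A sitemap gives you the site’s own list of pages up front, so you avoid re-implementing link discovery and are less likely to miss pages that aren’t linked from the homepage.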
Prompting Tip 3: Break the Problem Down
- Request a step-by-step breakdown of complex processes
- Ask for suggestions on how to modularize the code
- Seek advice on creating clear interfaces between components
- Inquire about potential challenges for each sub-task
My prompt:
Claude’s response:
It also gave me an overview of what each file and function would do (one of these modules is sketched just after the list):
Modules and their main functions:
main.py
- run(): Orchestrates the entire scraping process
sitemap_parser.py
- fetch_sitemap(url): Retrieves the sitemap XML
- parse_sitemap(xml): Extracts URLs from the sitemap
- get_all_urls(base_url): Combines fetching and parsing
content_extractor.py
- fetch_page(url): Retrieves a single webpage
- extract_text(html): Parses HTML and extracts relevant text
- process_page(url): Combines fetching and extraction
url_processor.py
- create_url_batches(urls, batch_size): Splits URL list into batches
- process_url_batch(batch): Handles a batch of URLs in parallel
file_handler.py
- initialize_output_file(filename): Creates and prepares the output file
- write_to_file(filename, content): Appends extracted content to the file
- finalize_file(filename): Performs any necessary cleanup or formatting
config.py
- Stores configuration variables (e.g., batch size, output filename, user agent)
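As an example of how one of these modules could be fleshed out, here is a rough sketch of content_extractor.py based on the function names above. The function bodies and the User-Agent string are my own guesses, not Claude’s actual code.

```python
# content_extractor.py (sketch): fetch a page and extract its readable text.
import requests
from bs4 import BeautifulSoup

HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; SimpleScraper/0.1)"}  # hypothetical value

def fetch_page(url: str) -> str:
    """Retrieve the raw HTML of a single webpage."""
    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    return response.text

def extract_text(html: str) -> str:
    """Parse HTML and return the visible text, skipping scripts and styles."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    return soup.get_text(separator="\n", strip=True)

def process_page(url: str) -> str:
    """Combine fetching and extraction for one URL."""
    return extract_text(fetch_page(url))
```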
Choosing the Right Tools
Prompting Tip 4: Do We Have Everything We Need?
- Ask about required Python version and compatible operating systems
- Inquire about necessary third-party libraries and their versions
- Request information on any system dependencies (e.g., database engines, web servers)
- Inquire about minimum hardware requirements (RAM, CPU, disk space)
- Ask about required network access or firewall configurations
- Ask about recommended tools for debugging or monitoring the program
My prompt:
Claude’s response:
It also gave me instructions on how to set up a virtual environment:
For the development environment, Claude recommended a virtual environment with Python 3.8+. After doing some Googling (just to be sure), I settled on the following libraries as critical (the matching imports are shown after the list):
- requests: for making HTTP requests
- beautifulsoup4: for parsing HTML and extracting data
- urllib (part of the standard library): for URL handling and parsing
- logging (part of the standard library): for better debugging and error tracking
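As a quick sanity check, these are the corresponding imports. Only requests and beautifulsoup4 need to be installed (for example with pip install requests beautifulsoup4); urllib and logging ship with Python.

```python
# Third-party packages (install with: pip install requests beautifulsoup4)
import requests
from bs4 import BeautifulSoup

# Standard library: nothing to install
import logging
from urllib.parse import urljoin, urlparse

logging.basicConfig(level=logging.INFO)  # simple logging setup for debugging
```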
Writing the Code
Prompting Tip 5: Ask the AI to Ask You for More Specifics
With a clear plan in place, I asked Claude to write the code. To make sure everything I needed was accounted for, I also asked it to ask me clarifying questions first, so that it could provide the most accurate answer.
Claude’s response:
I answered:
Based on my response, it provided the code split into different files.
It also provided the command I would need to run the program:
However, once I ran the code, I realised there was a major problem…
Debugging and Refinement
Prompting Tip 6: Debugging with AI
- Provide the exact error message or unexpected behavior
- Ask the AI to identify potential problem areas in your code
- Inquire about edge cases that might be triggering the error
- Inquire about best practices to prevent similar errors in the future
Upon running the initial code, I encountered several issues:
- XML parsing errors when trying to parse HTML content
- Incomplete scraping results
- Inability to scrape certain websites due to anti-bot measures
I shared these errors with Claude, and it then suggested improvements:
However, I was still running into issues. After some trial and error, Claude realised there was a problem with how the sitemaps were being searched for and quickly implemented a solution.
This updated version (roughly sketched after the list below):
- Fetches both sitemap_index.xml and sitemap.xml.
- Handles potential XML parsing errors more gracefully.
- Falls back to HTML parsing if no sitemap is found or if XML parsing fails.
- Uses urljoin to handle relative URLs properly.
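Here is a rough sketch of what that fallback logic looks like. It is my own reconstruction of the approach described above, not the exact code Claude produced; get_all_urls is the function name from the earlier plan.

```python
# Sketch: try sitemap_index.xml, then sitemap.xml, then fall back to parsing homepage links.
from urllib.parse import urljoin
from xml.etree import ElementTree

import requests
from bs4 import BeautifulSoup

def get_all_urls(base_url: str) -> list[str]:
    for path in ("/sitemap_index.xml", "/sitemap.xml"):
        try:
            response = requests.get(urljoin(base_url, path), timeout=10)
            response.raise_for_status()
            root = ElementTree.fromstring(response.content)
            # Note: a sitemap index lists sub-sitemaps; a fuller version would fetch those too
            urls = [el.text for el in root.iter() if el.tag.endswith("loc") and el.text]
            if urls:
                return urls
        except (requests.RequestException, ElementTree.ParseError):
            continue  # missing sitemap or malformed XML: try the next option
    # Fallback: no usable sitemap, so collect the links found on the homepage instead
    soup = BeautifulSoup(requests.get(base_url, timeout=10).text, "html.parser")
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]
```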
Final Touches
The program was now working as intended, but there were still some things it wasn’t doing particularly well. There is always scope for improvement, but first, we need to figure out where to make those changes.
Prompting Tip 7: Ask for Improvements
- Ask for suggestions to improve code readability and maintainability
- Request ideas for making the code more modular or reusable
- Inquire about ways to enhance error handling and robustness
- Request suggestions for improving the user interface or user experience
Claude always has numerous suggestions, but it is up to you, the developer, to use your intuition and experience to decide which ones actually make sense.
For example, for a program that only scrapes small, static websites, concurrent processing makes little difference. However, Claude suggested many other changes that could make the program much more solid (one of them is sketched after the list):
- Implement proper rate limiting: The current implementation uses a simple time.sleep(1) for rate limiting. A more sophisticated approach would be to use a rate-limiting library like ratelimit to ensure compliance with the target website's policies.
- Add a retry mechanism: Implement a retry mechanism for failed requests to improve reliability.
- Use a proper HTML parser: While BeautifulSoup is good, consider using a more efficient parser like lxml for better performance.
- Implement proper error handling and logging: Enhance error handling and logging to provide more detailed information about failures and simplify debugging.
- Use a configuration file: Move configuration parameters (like USER_AGENT, HEADERS, etc.) to a separate configuration file (e.g., YAML or JSON) for easier management.
- Implement checkpointing: Add a mechanism to save progress and resume scraping from where it left off in case of interruptions.
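As an example of the retry suggestion, here is a small sketch using only requests and exponential backoff. The helper fetch_with_retry and its default values are my own choices, not Claude’s exact recommendation.

```python
# Sketch: retry failed GET requests with exponential backoff between attempts.
import time

import requests

def fetch_with_retry(url: str, retries: int = 3, backoff: float = 1.0) -> requests.Response:
    for attempt in range(retries):
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
            return response
        except requests.RequestException:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(backoff * 2 ** attempt)  # wait 1s, 2s, 4s, ... before retrying
```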
Lessons Learned
This project taught me several valuable lessons:
- Start with a clear, well-defined goal
- Use AI as a collaborative tool, not just a code generator
- Iterative development and testing is crucial
- Always consider ethical implications and respect website policies
- Error handling and logging are as important as the core functionality
Here is the final code if you want to take a look at it in its entirety.
Happy coding!