File: tutorial-09-http_file_server.md

package info (click to toggle)
workflow 0.11.10-1
links: PTS, VCS
area: main
in suites: forky, sid
size: 2,744 kB
sloc: cpp: 33,792; ansic: 9,393; makefile: 9; sh: 6
file content (202 lines) | stat: -rw-r--r-- 8,612 bytes
parent folder | download | duplicates (2)
# Http server with file IO: http\_file\_server

# Sample code

[tutorial-09-http\_file\_server.cc](/tutorial/tutorial-09-http_file_server.cc)

# About http\_file\_server

http\_file\_server is a web server. You can start a web server after specifying the startup port and the root path (the default setting is the current path).   
You can also specify a certificate file and a key file in PEM format to start an HTTPS web server. User may access the server through command line, the request will be sent to IP address 127.0.0.1.  
The program mainly demonstrates how to use disk IO tasks. In the Linux system, we use the aio interface in the kernel of Linux, and the file reading is completely asynchronous.

# Starting a server

For starting a server, the steps are almost the same as those when starting an echo server or an HTTP proxy. There is one more way to start an SSL server here:

~~~cpp
class WFServerBase
{
    ...
    int start(unsigned short port, const char *cert_file, const char *key_file);
    ...
};
~~~

In other words, you can specify a cert file and a key file in PEM format to start an SSL server.   
In addition, when you define a server, you can use **std::bind()** to bind a root parameter to the process. The root parameter means the root path of the service.

~~~cpp
void process(WFHttpTask *server_task, const char *root)
{
    ...
}

int main(int argc, char *argv[])
{
    ...
    const char *root = (argc >= 3 ? argv[2] : ".");
    auto&& proc = std::bind(process, std::placeholders::_1, root);
    WFHttpServer server(proc);

    // start server
    ...
}
~~~

# Handling requests

Similar to http\_proxy, no threads are occupied in file reading. Instead, an asynchronous task is generated to read files, and a reply to the request is generated after the reading is completed.   
Please note again that the complete reply data should be read into the memory before the reply message is sent. Therefore, it is not suitable for transferring very large files.

~~~cpp
void process(WFHttpTask *server_task, const char *root)
{
    // generate abs path.
    ...

    int fd = open(abs_path.c_str(), O_RDONLY);
    if (fd >= 0)
    {
        size_t size = lseek(fd, 0, SEEK_END);
        void *buf = malloc(size);        /* As an example, assert(buf != NULL); */
        WFFileIOTask *pread_task;

        pread_task = WFTaskFactory::create_pread_task(fd, buf, size, 0,
                                                      pread_callback);
        /* To implement a more complicated server, please use series' context
         * instead of tasks' user_data to pass/store internal data. */
        pread_task->user_data = resp;    /* pass resp pointer to pread task. */
        server_task->user_data = buf;    /* to free() in callback() */
        server_task->set_callback([](WFHttpTask *t){ free(t->user_data); });
        series_of(server_task)->push_back(pread_task);
    }
    else
    {
        resp->set_status_code("404");
        resp->append_output_body("<html>404 Not Found.</html>");
    }
}
~~~

Unlike http\_proxy that generates a new HTTP client task, here a pread task is generated by the factory.   
[WFAlgoTaskFactory.h](/src/factory/WFTaskFactory.h) contains the definitions of relevant interfaces.

~~~cpp
struct FileIOArgs
{
    int fd;
    void *buf;
    size_t count;
    off_t offset;
};

...
using WFFileIOTask = WFFileTask<struct FileIOArgs>;
using fio_callback_t = std::function<void (WFFileIOTask *)>;
...

class WFTaskFactory
{
public:
    ...
    static WFFileIOTask *create_pread_task(int fd, void *buf, size_t count, off_t offset,
                                           fio_callback_t callback);

    static WFFileIOTask *create_pwrite_task(int fd, void *buf, size_t count, off_t offset,
                                            fio_callback_t callback);

    /* Interface with file path name */
	static WFFileIOTask *create_pread_task(const std::string& path, void *buf, size_t count, off_t offset,
                                           fio_callback_t callback);

    static WFFileIOTask *create_pwrite_task(const std::string& path, void *buf, size_t count, off_t offset,
                                            fio_callback_t callback);  
};
~~~

Both pread and pwrite return WFFileIOTask. We do not distinguish between sort and psort, and we do not distinguish between client and server task. They all follow the same principle.   
In addition to these two interfaces, preadv and pwritev return WFFileVIOTask; fsync and fdsync return WFFileSyncTask. You can see the details in the header file.   
The example uses the user\_data field of the task to save the global data of the service. For larger services, we recommend to use series context. You can see the [proxy examples](/tutorial/tutorial-05-http_proxy.cc) for details.

# Handling file reading results

~~~cpp
using namespace protocol;

void pread_callback(WFFileIOTask *task)
{
    FileIOArgs *args = task->get_args();
    long ret = task->get_retval();
    HttpResponse *resp = (HttpResponse *)task->user_data;

    /* close fd only when you created File IO task with **fd** interface. */
    close(args->fd);
    if (ret < 0)
    {
        resp->set_status_code("503");
        resp->append_output_body("<html>503 Internal Server Error.</html>");
    }
    else /* Use '_nocopy' carefully. */
        resp->append_output_body_nocopy(args->buf, ret);
}
~~~

Use **get\_args()** of the file task to get the input parameters. Here it is a FileIOArgs struct, and it's **fd** field will be -1 if the task was created with **pathname**.   
Use **get\_retval()** to get the return value of the operation. If ret < 0, the task fails. Otherwise, the ret is the size of the read data.   
In the file task, ret < 0 and task->get\_state()! = WFT\_STATE\_SUCCESS are completely equivalent.   
The memory of the buf domain is managed by ourselves. You can use  **append\_output\_body\_nocopy()** to pass that memory to resp.   
After the reply is completed, we will **free()** this block of memory with this line in the process:   
server\_task->set\_callback(\[](WFHttpTask \*t){ free(t->user\_data); });

# Interact with the server through command line

After the server is started, users may access it through command line. Simply input the file name that you want to get, or input Ctrl-D to end the program. The repeating process is implemnted by using WFRepeaterTask, which can be created by this factory function:
~~~cpp
using repeated_create_t = std::function<SubTask *(WFRepeaterTask *)>;
using repeater_callback_t = std::function<void (WFRpeaterTask *)>;

class WFTaskFactory
{
    WFRpeaterTask *create_repeater_task(repeated_create_t create, repeater_callback_t callback);
};
~~~
As above, a repeater task is created with a task creator function. The repeater calls the task creator repeatedly and run the task until the creator return a NULL pointer. When the using's input is not empty, our creator will create an HTTP task on IP 127.0.0.1 to access the server.  
~~~cpp
{
	auto&& create = [&scheme, port](WFRepeaterTask *)->SubTask *{
		...
		scanf("%1023s", buf);
		if (*buf == '\0')
			return NULL;

		std::string url = scheme + "127.0.0.1:" + std::to_string(port) + "/" + buf;
		WFHttpTask *task = WFTaskFactory::create_http_task(url, 0, 0,
									[](WFHttpTask *task) {
			...
		});

		return task;
	};
	
	WFFacilities::WaitGroup wg(1);
	WFRepeaterTask *repeater;
	repeater = WFTaskFactory::create_repeater_task(create, [&wg](WFRepeaterTask *) {
		wg.done();
	});

	repeater->start();
	wg.wait();

	server.stop();
}
~~~
Finally, when the creator returned NULL, the repeater's callback is called and the program will be ended.

# About the implementation of the file IO

Linux operating system supports a set of asynchronous IO system calls with high efficiency and very little CPU occupation. If you use our framework in a Linux system, this set of interfaces are used by default.   
We have implemented a set of posix aio interfaces to support other UNIX systems, and used the sigevent notification method of threads, but it is no longer in use because of its low efficiency.   
Currently, for non-Linux systems, asynchronous IO is always simulated by multi-threading. When an IO task arrives, a thread is created in real time to execute IO tasks, and then a callback is used to return to the handler thread pool.   
Multi-threaded IO is also the only choice in macOS, because macOS does not have good sigevent support and posix aio will not work in macOS.   
Some UNIX systems do not support fdatasync. In this case, an fdsync task is equivalent to an fsync task.