user/sven/linux.git/net/ceph, branch v3.4.48

rbd: remove linger unconditionally

2013-01-17T16:51:20Z

In __unregister_linger_request(), the request is being removed from the osd client's req_linger list only when the request has a non-null osd pointer. It should be done whether or not the request currently has an osd. This is most likely a non-issue because I believe the request will always have an osd when this function is called. Signed-off-by: Alex Elder Reviewed-by: Sage Weil (cherry picked from commit 61c74035626beb25a39b0273ccf7d75510bc36a1) Signed-off-by: Greg Kroah-Hartman
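The change described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct list_node` and its helpers are a small stand-in for the kernel's list API, and `toy_osd`, `toy_request`, and `toy_unregister_linger()` are hypothetical names, not the kernel's structures. The point it shows is that removal from the client-wide linger list happens whether or not the request has an osd.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n) { n->prev = n->next = n; }

static int list_empty(const struct list_node *head) { return head->next == head; }

static void list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-ins for the osd and request structures. */
struct toy_osd {
	struct list_node linger_requests;
};

struct toy_request {
	struct toy_osd *osd;              /* may be NULL */
	struct list_node linger_item;     /* entry on the client-wide linger list */
	struct list_node linger_osd_item; /* entry on the osd's linger list */
};

static void toy_unregister_linger(struct toy_request *req)
{
	assert(req != NULL);

	/* Always leave the client-wide list, with or without an osd. */
	list_del_init(&req->linger_item);

	/* Only the per-osd bookkeeping depends on having an osd. */
	if (req->osd) {
		list_del_init(&req->linger_osd_item);
		req->osd = NULL;
	}
}
```

With this shape, a request whose `osd` is NULL is still taken off the client-wide list, which is the behavior the commit message asks for.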

ceph: don't reference req after put

2013-01-17T16:51:20Z

In __unregister_request(), there is a call to list_del_init() referencing a request that was the subject of a call to ceph_osdc_put_request() on the previous line. This is not safe, because the request structure could have been freed by the time we reach the list_del_init(). Fix this by reversing the order of these lines. Signed-off-by: Alex Elder Reviewed-by: Sage Weil (cherry picked from commit 7d5f24812bd182a2471cb69c1c2baf0648332e1f) Signed-off-by: Greg Kroah-Hartman
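The ordering rule can be sketched in userspace C as follows. Everything here is assumed for illustration: the list helpers imitate the kernel's list API, and `toy_request`, `toy_put_request()`, and `toy_unregister_request()` are hypothetical names. The sketch unlinks the request while a reference is still held and only then drops the reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n) { n->prev = n->next = n; }

static int list_empty(const struct list_node *head) { return head->next == head; }

static void list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical reference-counted request. */
struct toy_request {
	int refcount;
	struct list_node lru_item;
};

static int toy_freed_count; /* lets a caller observe that the free happened */

static struct toy_request *toy_request_new(void)
{
	struct toy_request *req = malloc(sizeof(*req));

	if (!req)
		return NULL;
	req->refcount = 1;
	list_init(&req->lru_item);
	return req;
}

/* Drop one reference; the request is freed when the last one goes away. */
static void toy_put_request(struct toy_request *req)
{
	assert(req->refcount > 0);
	if (--req->refcount == 0) {
		free(req);
		toy_freed_count++;
	}
}

static void toy_unregister_request(struct toy_request *req)
{
	/* Touch the request first, while our reference keeps it alive. */
	list_del_init(&req->lru_item);

	/* The put comes last: req may be freed here and must not be used again. */
	toy_put_request(req);
}
```

Swapping the two statements in `toy_unregister_request()` would reproduce the bug: `list_del_init()` would then write through a pointer that `free()` may already have released.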

libceph: remove 'osdtimeout' option

2013-01-17T16:51:20Z

This would reset a connection with any OSD that had an outstanding request that was taking more than N seconds. The idea was that if the OSD was buggy, the client could compensate by resending the request. In reality, this only served to hide server bugs, and we haven't actually seen such a bug in quite a while. Moreover, the userspace client code never did this. More importantly, often the request is taking a long time because the OSD is trying to recover, or overloaded, and killing the connection and retrying would only make the situation worse by giving the OSD more work to do. Signed-off-by: Sage Weil Reviewed-by: Alex Elder (cherry picked from commit 83aff95eb9d60aff5497e9f44a2ae906b86d8e88) Signed-off-by: Greg Kroah-Hartman

libceph: avoid using freed osd in __kick_osd_requests()

2013-01-17T16:51:20Z

If an osd has no requests and no linger requests, __reset_osd() will just remove it with a call to __remove_osd(). That drops a reference to the osd, and therefore the osd may have been freed by the time __reset_osd() returns. That function offers no indication this may have occurred, and as a result the osd will continue to be used even when it's no longer valid. Change __reset_osd() so it returns an error (ENODEV) when it deletes the osd being reset. And change __kick_osd_requests() so it returns immediately (before referencing osd again) if __reset_osd() returns *any* error. Signed-off-by: Alex Elder Reviewed-by: Sage Weil (cherry picked from commit 685a7555ca69030739ddb57a47f0ea8ea80196a4) Signed-off-by: Greg Kroah-Hartman
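A userspace sketch of this contract is shown below. The names `toy_osd`, `toy_reset_osd()`, and `toy_kick_osd_requests()` are hypothetical, and the kick function returns a count purely so the behavior can be observed; the kernel function is not claimed to have this signature. The sketch shows a reset that reports -ENODEV when it has released the osd, and a caller that stops before touching the osd again.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical osd with simple counters instead of request lists. */
struct toy_osd {
	int nr_requests;
	int nr_linger;
	int resets;
};

static int toy_osds_freed; /* lets a caller observe that an osd was released */

/*
 * Reset the osd's connection. An osd with nothing queued is removed
 * instead, and -ENODEV tells the caller the pointer is no longer valid.
 */
static int toy_reset_osd(struct toy_osd *osd)
{
	assert(osd != NULL);

	if (osd->nr_requests == 0 && osd->nr_linger == 0) {
		free(osd);
		toy_osds_freed++;
		return -ENODEV;
	}
	osd->resets++;
	return 0;
}

/* Returns the number of requests that would be re-queued (sketch only). */
static int toy_kick_osd_requests(struct toy_osd *osd)
{
	int err = toy_reset_osd(osd);

	/* On any error, return before dereferencing osd again. */
	if (err)
		return 0;

	return osd->nr_requests;
}
```

Treating every nonzero return as "stop" mirrors the commit message's wording that the caller returns immediately on *any* error, not only on ENODEV.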

libceph: fix osdmap decode error paths

2013-01-17T16:51:19Z

Ensure that we set the err value correctly so that we do not pass a 0 value to ERR_PTR and confuse the calling code. (In particular, osd_client.c handle_map() will hit BUG_ON(!newmap)). Signed-off-by: Sage Weil Reviewed-by: Alex Elder (cherry picked from commit 0ed7285e0001b960c888e5455ae982025210ed3d) Signed-off-by: Greg Kroah-Hartman
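Why a 0 passed to ERR_PTR confuses the caller can be sketched in userspace C. The helpers below are simplified imitations of the kernel's ERR_PTR(), PTR_ERR(), and IS_ERR(), written under the assumption that error codes occupy the top 4095 values of the address space; `toy_map` and `toy_decode()` are hypothetical. An error path that forgets to set `err` ends up returning NULL, which does not test as an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno (or 0) as a pointer value. */
static void *toy_err_ptr(long err)
{
	assert(err <= 0 && err >= -TOY_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True only for pointer values in the reserved errno range. */
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)(-TOY_MAX_ERRNO);
}

struct toy_map {
	int epoch;
};

/*
 * Hypothetical decoder. When 'truncated' is set the input is bad;
 * 'set_err' selects between the fixed path and the buggy one that
 * reaches the error label with err still 0.
 */
static struct toy_map *toy_decode(int truncated, int set_err)
{
	static struct toy_map map = { .epoch = 1 };
	int err = 0;

	if (truncated) {
		if (set_err)
			err = -EINVAL;
		goto bad;
	}
	return &map;

bad:
	return toy_err_ptr(err);
}
```

In the buggy case the caller's error check passes and it receives a NULL map, which is the situation the commit message says trips the assertion in handle_map().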

libceph: fix protocol feature mismatch failure path

2013-01-17T16:51:19Z

We should not set con->state to CLOSED here; that happens in ceph_fault() in the caller, where it first asserts that the state is not yet CLOSED. Avoids a BUG when the features don't match. Since fail_protocol() has become a trivial wrapper, replace calls to it with direct calls to reset_connection(). Signed-off-by: Sage Weil Reviewed-by: Alex Elder (cherry picked from commit 0fa6ebc600bc8e830551aee47a0e929e818a1868) Signed-off-by: Greg Kroah-Hartman

libceph: WARN, don't BUG on unexpected connection states

2013-01-17T16:51:19Z

A number of assertions in the ceph messenger are implemented with BUG_ON(), killing the system if a connection's state doesn't match what's expected. At this point our state model is (evidently) not well understood enough for these assertions to trigger a BUG(). Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...) so we learn about these issues without killing the machine. We now recognize that a connection fault can occur due to a socket closure at any time, regardless of the state of the connection. So there is really nothing we can assert about the state of the connection at that point; eliminate that assertion. Reported-by: Ugis Tested-by: Ugis Signed-off-by: Alex Elder Reviewed-by: Sage Weil (cherry picked from commit 122070a2ffc91f87fe8e8493eb0ac61986c5557c) Signed-off-by: Greg Kroah-Hartman
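The difference between the two assertion styles can be sketched in userspace C. `TOY_BUG_ON()` and `TOY_WARN_ON()` are hypothetical imitations, not the kernel macros: the first aborts the process via assert(), the second logs and lets execution continue. `toy_connection` and `toy_handle_message()` are likewise made up for the example.

```c
#include <assert.h>
#include <stdio.h>

static int toy_warn_count; /* number of warnings reported so far */

/* Report an unexpected condition and keep running. */
static int toy_warn_on(int cond, const char *expr, const char *file, int line)
{
	if (cond) {
		toy_warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
	return cond;
}

/* BUG_ON-like: aborts the whole program when the condition holds. */
#define TOY_BUG_ON(cond)  assert(!(cond))

/* WARN_ON-like: logs the condition and continues. */
#define TOY_WARN_ON(cond) toy_warn_on(!!(cond), #cond, __FILE__, __LINE__)

enum toy_con_state { TOY_CON_CONNECTING, TOY_CON_OPEN, TOY_CON_CLOSED };

struct toy_connection {
	enum toy_con_state state;
	int messages_handled;
};

static void toy_handle_message(struct toy_connection *con)
{
	/* Previously this would have been TOY_BUG_ON(con->state != TOY_CON_OPEN). */
	TOY_WARN_ON(con->state != TOY_CON_OPEN);

	con->messages_handled++;
}
```

With the warning variant, an unexpected state is recorded on stderr and the message is still handled, so the state mismatch becomes visible without taking the process (or, in the kernel's case, the machine) down.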

libceph: always reset osds when kicking

2013-01-17T16:51:19Z

When ceph_osdc_handle_map() is called to process a new osd map, kick_requests() is called to ensure all affected requests are updated if necessary to reflect changes in the osd map. This happens in two cases: whenever an incremental map update is processed; and when a full map update (or the last one if there is more than one) gets processed. In the former case, the kick_requests() call is followed immediately by a call to reset_changed_osds() to ensure any connections to osds affected by the map change are reset. But for full map updates this isn't done. Both cases should be doing this osd reset. Rather than duplicating the reset_changed_osds() call, move it into the end of kick_requests(). Signed-off-by: Alex Elder Reviewed-by: Sage Weil (cherry picked from commit e6d50f67a6b1a6252a616e6e629473b5c4277218) Signed-off-by: Greg Kroah-Hartman

libceph: move linger requests sooner in kick_requests()

2013-01-17T16:51:19Z

The kick_requests() function is called by ceph_osdc_handle_map() when an osd map change has been indicated. Its purpose is to re-queue any request whose target osd is different from what it was when it was originally sent. It is structured as two loops, one for incomplete but registered requests, and a second for handling completed linger requests. As a special case, in the first loop if a request marked to linger has not yet completed, it is moved from the request list to the linger list. This is a quick and dirty way to have the second loop handle sending the request along with all the other linger requests.

Because of the way it's done now, however, this quick and dirty solution can result in these incomplete linger requests never getting re-sent as desired. The problem lies in the fact that the second loop only arranges for a linger request to be sent if it appears its target osd has changed. This is the proper handling for *completed* linger requests (it avoids issuing the same linger request twice to the same osd). But although the linger requests added to the list in the first loop may have been sent, they have not yet completed, so they need to be re-sent regardless of whether their target osd has changed.

The first required fix is that we need to avoid calling __map_request() on any incomplete linger request. Otherwise the subsequent __map_request() call in the second loop will find the target osd has not changed and will therefore not re-send the request.

Second, we need to be sure that a sent but incomplete linger request gets re-sent. If the target osd is the same with the new osd map as it was when the request was originally sent, this won't happen. This can be fixed through careful handling when we move these requests from the request list to the linger list, by unregistering the request *before* it is registered as a linger request. This works because a side-effect of unregistering the request is to make the request's r_osd pointer be NULL, and *that* will ensure the second loop actually re-sends the linger request.

Processing of such a request is done at that point, so continue with the next one once it's been moved. Signed-off-by: Alex Elder Reviewed-by: Sage Weil (cherry picked from commit ab60b16d3c31b9bd9fd5b39f97dc42c52a50b67d) Signed-off-by: Greg Kroah-Hartman

libceph: register request before unregister linger

2013-01-17T16:51:19Z

In kick_requests(), we need to register the request before we unregister the linger request. Otherwise the unregister will reset the request's osd pointer to NULL. Signed-off-by: Alex Elder Reviewed-by: Sage Weil (cherry picked from commit c89ce05e0c5a01a256100ac6a6019f276bdd1ca6) Signed-off-by: Greg Kroah-Hartman